PREFACE

The  present  text  is  just  what  its  title  claims,  namely,  a  text  on  logic 
written  for  the  mathematician.  The  text  starts  from  first  principles,  not 
presupposing  any  previous  specific  knowledge  of  formal  logic,  and  tries  to 
cover  thoroughly  all  logical  questions  which  are  of  interest  to  a  practicing 
mathematician. 

We  use  symbolic  logic  in  the  present  text,  because  we  do  not  know  how 
otherwise to attain the desired precision. Any reader of the text must perforce
become a competent operator in symbolic logic. However, for us this
is only a means to an end, and not an end in itself. Indeed, to mitigate the
difficulties of learning and operating the symbolic logic, we have introduced
some novelties. These may be of interest to students of logic, but
introduction  of  logical  novelties  is  not  any  part  of  the  aim  of  this  text.  We 
seek to convey to mathematicians a precise knowledge of the logical principles
which they use in their daily mathematics, and to do so as quickly as
possible.  In  this  respect,  we  feel  that  the  present  text  is  unique. 

Modern logic has become a large and diversified field of study, with many
well-developed  branches.  Many  of  these  branches  have  little  value  to  the 
mathematician  as  a  tool  for  mathematical  reasoning.  A  text  on  such  a 
branch  of  logic,  however  excellent,  would  be  of  little  interest  to  a  reader 
who  is  primarily  a  mathematician.  Contrariwise,  certain  topics  of  great 
value  as  tools  for  mathematical  reasoning  have  little  interest  for  students 
of  logic,  and  are  almost  never  treated  in  books  on  logic.  Thus  it  happens 
that  among  the  many  books  on  logic,  none  is  completely  suitable  for  the 
mathematician. 

One  of  the  most  suitable  is  the  epoch-making  "Principia  Mathematica" 
of Whitehead and Russell. The subject matter in "Principia Mathematica"
was admirably chosen for the needs of mathematicians, and we have followed
this text closely with regard to subject matter. We have omitted
a few topics which seem to be little used nowadays, and instead have included
treatments of such new developments as Zorn's lemma. We have
improved on the symbolic machinery of "Principia Mathematica," which
is out of date and extremely unwieldy. By using techniques invented since
its writing, we have succeeded in condensing most of "Principia
Mathematica's" three large volumes into the present text.

Since  familiar  logical  principles  often  look  very  strange  in  the  garb  of 
symbolic logic, we have included a large number of pertinent examples of
ordinary  mathematical  reasoning  handled  by  symbolic  means.  This  should 
help  the  reader  to  apply  the  principles  of  this  text  to  his  own  problems  in 
mathematical  reasoning. 

Although the present text is complete and does not presuppose any previous
acquaintance with logic, it is written for the mathematician with some
maturity. For one thing, the illustrative examples are chosen from a
variety of fields of mathematics, and their point will be lost on the
mathematically immature.

By  including  numerous  exercises,  we  have  tried  to  make  this  text  suitable 
for  classroom  instruction,  and  have  used  it  this  way  ourselves.  With  a 
teacher  to  help,  less  maturity  is  needed  on  the  part  of  the  reader  than  if  he 
is reading it alone. However, even with a teacher to help, it is recommended
that the student should have had some mathematics beyond the
calculus,  preferably  a  course  in  which  some  attention  was  paid  to  careful 
mathematical  reasoning.  Let  us  recall  that  this  text  attempts  to  treat  all 
logical  principles  which  are  useful  in  modern  mathematics,  and  unless  the 
reader  has  some  acquaintance  with  the  mathematical  fields  in  which  the 
principles  are  to  be  used,  he  will  find  a  study  of  the  principles  alone  rather 
sterile. It is in the hope of counteracting such sterility that we have included
so many illustrations of reasoning from standard mathematical texts.

We  are  vastly  indebted  to  the  many  logicians  with  whom  we  have  been 
associated  in  the  past  twenty  years  as  well  as  to  the  many  others  whose 
writings  we  have  read.  This  debt  is  only  partially  indicated  by  the  titles 
in  our  bibliography.  Almost  equally  important  for  the  present  text  have 
been  the  many  suggestions  from  mathematicians  who  are  not  primarily 
logicians,  but  who  have  been  kind  enough  to  tell  us  of  logical  questions 
which  they  would  like  to  see  answered.  We  hope  that  they  will  find  them 
answered  in  the  present  text. 

Those  theorems,  or  parts  of  theorems,  or  corollaries,  which  are  referred 
to  at  least  five  times  in  later  sections  are  marked  with  a  *.  Those  of 
particular  importance  are  marked  with  a  **. 

At the end of the present text there is a bibliography arranged alphabetically
according to the names of the authors. References to items having
a single author are made by giving the author's name and the date of the
item, as "Hardy, 1947." In the one case where this is ambiguous, we use
"Zermelo, 1908, first paper" and "Zermelo, 1908, second paper." References
to items having two authors are made by giving the names of the
authors, as "Hardy and Wright."

J.  Barkley  Rosser 

Ithaca,  N.Y. 
September,  1952 
CHAPTER I
WHAT  IS  SYMBOLIC  LOGIC? 

1.  A  Hypothetical  Interview.  We  wish  to  record  an  imaginary  interview 
between a modern mathematician and one of past times. Our mathematician
of the past will be Descartes, but we should like to leave our
modern  mathematician  anonymous;  in  the  classic  tradition  of  mathematics, 
we  shall  refer  to  him  as  Professor  X.  We  imagine  Professor  X  equipped 
with  a  time-traveling  machine,  so  that  he  can  go  back  to  chosen  points  in 
time  and  interview  various  famous  mathematicians  of  the  past.  Professor 
X  elects  to  go  back  to  a  time  just  after  the  invention  of  coordinate  geometry 
by  Descartes  and  to  have  an  interview  with  Descartes  about  his  new 
invention.  Professor  X  takes  with  him  a  gift  of  several  reams  of  coordinate 
paper,  together  with  a  supply  of  mechanical  pencils  and  erasers,  which  so 
impress  Descartes  that  he  is  very  cordial.  They  discourse  on  many  matters, 
of  which  we  shall  record  only  their  discussion  of  continuous  curves. 

They  define  a  curve  as  continuous  if  it  can  be  drawn  without  lifting  the 
pencil  from  the  paper.  Descartes,  fascinated  with  his  pencils  and  paper, 
draws  a  large  number  of  curves  and  classifies  them  into  continuous  and 
discontinuous. Fortified with his knowledge of early twentieth century
mathematics, Professor X is able to suggest many interesting curves and
even  manages  to  trick  Descartes  at  first  with  some  special  curves  like 

sin  X 

which is not defined at x = 0 and so has a gap there which makes it discontinuous.
However, Descartes, being a clever mathematician, soon
catches  on  to  all  Professor  X's  tricks  and  can  quickly  and  unerringly 
classify  even  the  most  complicated  curves  as  continuous  or  discontinuous. 
Needless to say, Professor X is familiar with the modern precise definition
of  continuity: 

(1) A function f is continuous at x = a if f(a) is defined and unique,
and if for each positive ε there is a positive δ such that whenever
|x - a| < δ it follows that |f(x) - f(a)| < ε.

(2) A function f is continuous if it is continuous at x = a for each value
of a.
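
For readers who like to see the quantifiers laid out, definitions (1) and (2) can be restated as below. This is only a sketch in standard notation; the symbols \forall, \exists, and \implies are assumptions of this restatement and have not yet been introduced in the text.

```latex
% (1) f is continuous at x = a
f(a) \text{ is defined, and }
\forall \varepsilon > 0 \;\, \exists \delta > 0 \;\, \forall x \;
\bigl( |x - a| < \delta \implies |f(x) - f(a)| < \varepsilon \bigr)

% (2) f is continuous
\forall a \; \bigl( f \text{ is continuous at } x = a \bigr)
```

Note the order of the quantifiers in (1): δ is chosen after ε and may depend on both ε and a.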

Professor  X  decides  to  acquaint  Descartes  with  this  definition  with  the 
intention of persuading him to adopt it in place of the vague intuitive idea of
tracing a curve without lifting the pencil from the paper. He decides further
that he cannot argue in favor of his precise ε-δ definition on the basis that
it  is  more  useful  for  deciding  whether  a  curve  (or  function)  is  continuous. 
Already,  with  his  vague  definition  of  continuity,  Descartes  can  decide 
which  curves  are  continuous  and  can  do  so  quickly  and  correctly.  As  a 
matter of fact, Professor X realizes that he himself usually decides whether
a function is continuous by visualizing if its graph can be drawn with a
continuous pencil stroke and only uses the ε-δ definition of continuity
to prove the conclusion which he has reached by visualizing the graph.
Clearly then, the value of the ε-δ definition lies mainly in proving things
about continuity and only slightly in deciding things about continuity.
Professor  X  reflects  that  the  situation  is  quite  analogous  to  that  in  early 
twentieth century mathematical circles where, if one has a difficult mathematical
problem, one is apt to proceed quite intuitively, interchanging
limits  of  integration,  differentiating  under  the  integral  sign,  etc.,  in  hopes 
of  guessing  an  answer.  Only  after  one  has  guessed  an  answer,  and  wishes  to 
verify it beyond doubt, does one bring in the precise definitions, the ε's and
δ's, and the other powerful machinery of modern mathematics. For getting
answers,  it  is  better  to  use  intuitive  arguments,  even  rather  vague  ones. 
For  proving  answers,  only  rigid,  formal  arguments  can  be  trusted. 

Professor  X  thinks  of  an  analogous  situation  which  he  can  present  to 
Descartes.  For  the  Egyptian  originators  of  geometry,  geometric  concepts 
were  quite  vague.  A  straight  line  was  a  stretched  string;  parallel  lines  were 
wagon  tracks;  etc.  This  vagueness  did  not  prevent  the  Egyptians  from 
discovering  many  useful  geometric  theorems  but  made  it  quite  impossible 
for  them  to  prove  them.  However,  the  Greeks  introduced  the  precise  ideas 
of  abstract  straight  lines,  etc.,  and  were  thus  enabled  to  devise  proofs  of 
geometric  theorems.  The  great  increase  of  geometric  knowledge  with  the 
Greeks  makes  it  hard  to  believe  that  the  increased  precision  was  not  also 
of  value  in  discovering  geometric  theorems  as  well  as  proving  them. 

Actually,  Professor  X  found  Descartes  very  agreeable  to  his  suggestions 
and  quite  willing  to  replace  his  vague  idea  of  continuity  by  a  precise  one. 
However, Descartes raised one difficulty which Professor X had not foreseen.
Descartes put it as follows.

"I  have  here  an  important  concept  which  I  call  continuity.  At  present 
my  notion  of  it  is  rather  vague,  not  sufficiently  vague  that  I  cannot  decide 
which  curves  are  continuous,  but  too  vague  to  permit  of  careful  proofs. 
You  are  proposing  a  precise  definition  of  this  same  notion.  However, 
since  my  definition  is  too  vague  to  be  the  basis  for  a  careful  proof,  how  are 
we  going  to  verify  that  my  vague  definition  and  your  precise  definition  are 
definitions  of  the  same  thing?" 
If by "verify" Descartes meant "prove," it obviously could not be done,
since  his  definition  was  too  vague  for  proof.  If  by  "verify"  Descartes 
meant  "decide,"  then  it  might  be  done,  since  his  definition  was  not  too 
vague for purposes of coming to decisions. Actually, Descartes and Professor
X did finally decide that the two definitions were equivalent, and
they arrived at the decision as follows. Descartes had drawn a large number
of curves and classified them into continuous and discontinuous, using
his  vague  definition  of  continuity.  He  and  Professor  X  checked  through  all 
these  curves  and  classified  them  into  continuous  and  discontinuous  using 
the ε-δ definition of continuity. Both definitions gave the same classification.
As these were all the interesting curves that either of them had been
able  to  think  of,  the  evidence  seemed  "conclusive"  that  the  two  definitions 
were  equivalent. 

2.  The  Role  of  Symbolic  Logic.  When  Professor  X  returned  to  the 
present,  he  related  these  matters  to  us.  We  said  that  we  were  reminded 
of  the  situation  with  respect  to  symbolic  logic.  Professor  X  suggested  that, 
as  he  knew  nothing  about  symbolic  logic,  the  connection  could  hardly  be 
apparent  to  him,  and  he  asked  if  we  could  explain  without  getting  too 
complicated.    We  replied  as  follows. 

Suppose  Professor  X  wishes  to  prove  that  from  assumption  A  he  can 
deduce  conclusion  Z.  How  does  he  proceed?  The  most  straightforward 
way is to observe that B is a logical consequence of A, then C is a logical
consequence  of  B,  and  so  on  until  he  comes  to  Z.  For  this,  it  is  required 
not  only  that  Professor  X  be  able  to  discover  the  sequence  of  statements 
B,  C,  .  .  .  ,  but  that  he  be  able  to  decide  that  each  is  a  logical  consequence 
of  the  preceding.  One  thing  that  symbolic  logic  does  is  give  a  precise 
definition of when one statement is a logical consequence of another statement.
To get the connection with Descartes, we set up an analogy as
follows.  A  step  of  Professor  X's  proof  (such  as  deducing  B  from  A,  or  C 
from  B,  etc.)  is  to  correspond  to  one  of  Descartes's  curves.  Deciding 
whether  the  step  is  logically  correct  or  not  is  to  correspond  to  deciding 
whether  the  curve  is  continuous  or  not.  To  continue  the  analogy,  we  note 
that  Professor  X  is  quite  skillful  at  deciding  when  a  step  is  logically  correct, 
just  as  Descartes  was  quite  skillful  at  deciding  when  a  curve  is  continuous. 
Moreover,  Professor  X  bases  his  decisions  on  a  rather  vague  intuitive  notion 
of  logical  correctness,  just  as  Descartes  based  his  decisions  on  a  rather  vague 
intuitive  notion  of  continuity.  Furthermore,  the  vague  intuitive  notion  of 
logical  correctness  is  adequate  for  deciding  about  the  correctness  of  a 
logical  step,  just  as  the  vague  intuitive  notion  of  continuity  was  adequate 
for  deciding  about  the  continuity  of  a  curve.  If  one  wishes  to  prove  the 
correctness  of  a  logical  step,  a  precise  definition  of  logical  correctness  will 
be  needed,  just  as  a  precise  definition  of  continuity  was  needed  before 
Descartes could prove a curve to be continuous. Finally, symbolic logic
furnishes  a  precise  definition  of  logical  correctness  and  so  is  analogous  to 
the ε-δ definition of continuity, which furnishes a precise definition of
continuity. 

"Why  do  you  think  my  notion  of  logical  correctness  is  rather  vague  and 
intuitive?"  asked  Professor  X.  "I  admit  that  I  very  seldom  justify  the 
logic  involved  in  my  proofs,  but  that  doesn't  prove  that  I  can't.  After  all, 
I  took  two  years  of  good  stiff  courses  in  logic  under  the  chairman  of  the 
philosophy  department  back  in  '27-'29." 

Our  reply  was  that  classical  logic  was  quite  inadequate  for  mathematical 
reasoning,  being  particularly  weak  in  treating  functions,  use  of  infinite 
classes,  and  other  matters  of  great  importance  in  mathematics.  As  a  matter 
of  fact,  the  first  treatment  of  logic  adequate  for  use  in  modern  mathematics 
was the famous "Principia Mathematica" of Whitehead and Russell (see
Whitehead  and  Russell). 

Professor  X  admitted  that  his  two  years  of  logic  had  been  of  very  little 
use  in  mathematics.  He  further  admitted  that  he  had  no  notion  how  to 
give  a  precise  definition  of  logical  correctness.  Nevertheless,  he  had  always 
been  able  to  tell  which  proofs  were  valid  and  which  were  not.  What  would 
he  gain  by  learning  a  precise  definition  of  logical  correctness? 

We  countered  by  referring  him  back  to  his  interview  with  Descartes. 
What  would  he  have  said  if  Descartes  had  answered  in  similar  fashion  that 
he  had  been  getting  along  very  well  with  a  vague  definition  of  continuity 
and  had  no  need  of  a  precise  definition? 

This seemed to satisfy Professor X. However, he had one further question
to ask.

"I should like to ask the same question that Descartes asked. You are
proposing  to  give  a  precise  definition  of  logical  correctness  which  is  to  be 
the  same  as  my  vague  intuitive  feeling  for  logical  correctness.  How  do  you 
intend  to  show  that  they  are  the  same?" 

This  is  not  merely  Professor  X's  question.  It  should  be  the  question  of 
every  reader  of  the  present  text. 

Actually,  not  all  mathematicians  have  exactly  the  same  notion  of  logical 
correctness.  Mathematics  is  a  living,  growing  subject,  and  mathematicians 
do  not  all  work  in  the  same  branch  of  mathematics.  Often  mathematicians 
in  one  branch  of  mathematics  make  constant  use  of  some  logical  principle 
which  is  regarded  with  distrust  by  mathematicians  in  other  branches. 
The  axiom  of  choice,  to  which  we  shall  devote  a  chapter  of  discussion,  is 
such  a  principle. 

However,  there  is  a  sort  of  "common  denominator"  of  notions  of  logical 
correctness, and we claim to give a symbolic logic which is a precise definition
of logical correctness which agrees with this "common denominator."
Our symbolic logic is accordingly incomplete. In the case of a principle like
the axiom of choice, which is in dispute among mathematicians, our symbolic
logic deliberately fails to classify it as either correct or incorrect,
leaving the individual reader free to make whichever decision pleases him
most. However, we do attempt to convince the reader that logical principles
which are judged correct by the great majority of mathematicians are
classified  as  correct  by  our  symbolic  logic  and  that  principles  which  are 
judged  incorrect  by  the  great  majority  of  mathematicians  are  classified  as 
incorrect  by  our  symbolic  logic. 

Our procedure for doing this has already been foreshadowed in the interview
between Descartes and Professor X. Just as they decided to accept
the  equivalence  of  the  intuitive  and  precise  definitions  of  continuity  because 
these  definitions  agreed  in  a  large  number  of  cases,  even  so  a  reader  might 
be  convinced  that  our  symbolic  logic  agrees  with  his  intuitive  notions  of 
logical  correctness  if  he  is  shown  that  they  agree  in  a  large  number  of  cases. 
Accordingly,  we  shall  give  a  large  number  and  wide  variety  of  illustrations 
of  mathematical  reasoning  and  show  how  to  classify  each  as  correct  or 
incorrect  on  the  basis  of  our  symbolic  logic.  We  have  tried  to  choose  our 
illustrations  from  well-known  sources,  so  that  there  would  be  no  doubt 
about the general opinion of mathematicians as to the correctness or incorrectness
of the reasoning in our illustrations. With the general opinion
on  the  correctness  agreeing  with  our  symbolic  logic  in  a  wide  variety  of 
cases,  we  feel  that  most  readers  will  be  convinced. 

For  the  benefit  of  any  professional  skeptics,  we  admit  here  and  now  that 
certainly  no  number  of  illustrations  could  ever  suffice  to  carry  absolute 
conviction. 

The  symbolic  logic  which  we  present  is  a  modernized  version  of  that 
presented  in  the  "Principia  Mathematica"  of  Whitehead  and  Russell.  We 
have  altered  the  form  of  the  system  somewhat,  using  a  greatly  simplified 
version of the theory of types due to Quine (see Quine, 1937). Minor details
have  been  adjusted  to  bring  them  into  line  with  common  mathematical 
usage.  Simplifications  and  improvements  of  the  proofs  have  been  adopted 
from  numerous  sources.  We  have  not  attempted  to  list  these  sources, 
since  in  the  present  text  we  are  not  concerned  with  the  genesis  of  the  logic 
but  with  its  applications.  Persons  interested  in  the  connections  of  this 
symbolic logic with others may consult such works as Hilbert and Bernays;
Church, 1944; Quine, 1951; and Hilbert and Ackermann.

3.  General  Nature  of  Symbolic  Logic.  The  aim  in  constructing  our 
symbolic  logic  is  that  it  shall  serve  as  a  precise  criterion  for  determining 
whether  or  not  a  given  instance  of  mathematical  reasoning  is  correct.  The 
symbolic  logic  which  we  shall  present  is  primarily  intended  to  be  a  tool  in 
mathematical  reasoning.    Of  course,  many  of  the  logical  principles  involved 
have general application outside of mathematics, but there are many fields
of  human  endeavor  in  which  these  principles  are  of  little  value.  Politics, 
salesmanship,  ethics,  and  many  such  fields  have  little  or  no  use  for  the  sort 
of  logic  used  in  mathematics,  and  for  these  our  symbolic  logic  would  be  quite 
useless.  In  engineering  and  science,  particularly  those  branches  of  science 
which  make  extensive  use  of  mathematics,  the  symbolic  logic  might  be  of 
considerable  value.  However,  it  would  be  fairly  inadequate  for  the  logical 
needs  of  even  the  most  mathematical  sciences.  For  one  thing,  no  adequate 
symbolic treatment of the relationship involving cause and effect has yet
been  devised.  However,  if  one  is  satisfied  to  restrict  attention  to  purely 
mathematical  reasoning,  several  quite  satisfactory  symbolic  logics  are 
available.    We  present  one  such  in  the  present  text. 

The  components  of  mathematical  reasoning  are  mathematical  statements. 
So,  in  building  a  symbolic  logic,  we  must  start  with  a  precise  definition  of 
what  a  mathematical  statement  is.  Intuitively,  we  can  say  that  it  is 
merely  a  declarative  sentence  dealing  exclusively  with  mathematical  and 
logical  matters.  Needless  to  say,  it  need  not  be  true.  "3  is  a  prime"  and 
"6  is  a  prime"  are  both  mathematical  statements,  the  first  true,  and  the 
second  false. 

Because all existing languages are full of words with multiple or ambiguous
meanings, it was found necessary to construct a complete new language
in  order  to  be  able  to  give  a  precise  definition  of  "mathematical  statement." 
This  language  is  called  symbolic  logic.  In  order  to  aid  the  reader  in  learning 
this  new  language,  we  shall  introduce  him  to  it  gradually  over  several 
chapters.  Our  discussions  will  be  rather  general  and  descriptive  at  first, 
becoming more and more exact. Correspondingly our notion of a mathematical
statement will at first be merely the vague notion of a declarative
sentence  but  will  gradually  be  sharpened.  Finally  in  Chapter  IX  we  shall 
have  developed  our  symbolic  logic  sufficiently  to  be  able  to  give  a  precise 
definition  of  a  mathematical  statement. 

We  shall  drop  the  "mathematical"  and  henceforth  refer  to  a  mathematical 
statement  merely  as  a  "statement." 

Once  a  precise  definition  of  "statement"  has  been  given  (see  Chapter  IX), 
one can give a precise definition of "valid statement" and of "demonstration."
A demonstration shall be a sequence of statements such that each
statement  is  either  already  known  to  be  valid  or  is  an  assumption  or  is 
derived  from  previous  statements  of  the  sequence  in  a  specified  fashion. 
The  analogy  with  the  usual  form  of  mathematical  demonstration  is  quite 
intentional.  Certain  statements,  designated  as  "axioms,"  are  taken  to  be 
valid, and then any other statement is called "valid" if it is the final statement
in a demonstration that involves no assumptions, that is, that proceeds
from axioms alone.
Our definitions of "axiom" and "demonstration" will be carefully and
intentionally framed so that they depend only on the forms of the statements
involved, and not in the least on the meanings. Thus the decision
as  to  whether  a  statement  is  an  axiom  or  whether  a  sequence  of  statements  is 
a  demonstration  depends  not  on  intelligence,  but  on  clerical  skill.  One 
could  build  a  machine  which  would  be  quite  capable  of  making  these 
decisions  correctly.  That  is,  one  could  build  a  machine  which  would  check 
the  logical  correctness  of  any  given  proof  of  a  mathematical  theorem.  That 
the  check  is  mechanical  does  not  mean  that  it  requires  no  intelligence  at  all. 
There  are  many  machines  of  a  sufficient  complexity  that  at  least  a  low 
order  of  intelligence  is  required  to  match  their  performance.  In  the  present 
case,  the  ability  to  perform  simple  arithmetical  computations  is  enough  to 
check axioms and demonstrations, as was shown by Gödel (see Gödel, 1931),
who put the definitions into an arithmetical form. Thus, a person with
simple arithmetical skills can check the proofs of the most difficult mathematical
demonstrations, provided that the proofs are first expressed in
symbolic logic. This is due to the fact that, in symbolic logic, demonstrations
depend only on the forms of statements, and not at all on their
meanings.
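
To suggest how a formula can be turned into a matter of arithmetic, here is a minimal sketch of a prime-power coding in the spirit of Gödel's arithmetization. It is an illustration only: the symbol table SYMBOL_CODES and the function names are hypothetical, and this is not claimed to be the coding used in Gödel, 1931, or in the present text.

```python
# Hypothetical symbol codes, chosen only for illustration.
SYMBOL_CODES = {"(": 1, ")": 2, "~": 3, "v": 4, "p": 5, "q": 6}
CODE_SYMBOLS = {code: sym for sym, code in SYMBOL_CODES.items()}


def primes():
    """Yield 2, 3, 5, ... by trial division (adequate for short formulas)."""
    found = []
    n = 2
    while True:
        if all(n % p for p in found):
            found.append(n)
            yield n
        n += 1


def encode(formula):
    """Map a string of symbols to the product of (i-th prime) ** code(i-th symbol)."""
    number = 1
    for p, sym in zip(primes(), formula):
        number *= p ** SYMBOL_CODES[sym]
    return number


def decode(number):
    """Recover the symbol string by reading off successive prime exponents."""
    symbols = []
    for p in primes():
        if number == 1:
            break
        exponent = 0
        while number % p == 0:
            number //= p
            exponent += 1
        symbols.append(CODE_SYMBOLS[exponent])
    return "".join(symbols)
```

Under these assumed codes, the formula "~p" becomes 2**3 * 3**5 = 1944, and decode(1944) returns "~p" again. Once formulas are numbers, questions about their form become questions about divisibility, which is the sense in which simple arithmetical skill suffices for the check.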

This  does  not  mean  that  it  is  now  any  easier  to  discover  a  proof  for  a 
difficult  theorem.  This  still  requires  the  same  high  order  of  mathematical 
talent  as  before.  However,  once  the  proof  is  discovered,  and  stated  in 
symbolic  logic,  it  can  be  checked  by  a  moron. 

This  complete  lack  of  any  reference  to  the  meanings  of  statements  in 
symbolic  logic  indicates  that  there  is  no  need  for  them  to  have  meanings. 
This  allows  us  to  introduce  formulas  whenever  they  are  useful  without 
reference to whether they are meaningful. In fact, there is a type of formula
about whose meaning (if any) there is great disagreement. It happens
to  be  a  useful  type  of  formula,  and  we  use  it  frequently,  not  being  the  least 
bit  inconvenienced  by  its  possible  lack  of  meaning  (see  Chapter  VIII). 

This lack of reference to meanings also enables us to evade quite a number
of difficult philosophical questions. This situation is quite in line with
current  mathematical  practice.  Consider  the  positive  integers,  which  are 
at  the  basis  of  most  of  mathematics.  Mathematicians  do  not  care  in  the 
least  what  the  meanings  of  the  positive  integers  are,  or  even  if  they  have 
meanings.  For  the  mathematician,  it  suffices  to  know  what  operations  he  is 
permitted  to  perform  on  the  positive  integers.  Once  this  information  is 
available,  any  information  as  to  the  meanings  of  the  integers  is  wholly 
irrelevant  for  mathematical  purposes.  The  same  applies  to  real  numbers, 
imaginary  numbers,  functions,  or  any  other  of  the  paraphernalia  of 
mathematics. 

The  matter  was  well  expressed  by  Lewis  Carroll,  long-time  mathematical 
lecturer of Christ Church, who upon being asked to contribute to a philosophical
symposium responded:

"And what mean all these mysteries to me
Whose life is full of indices and surds?
x² + 7x + 53
= 11/3."

We  shall  not  make  any  use  of  the  familiar  term  "proposition."  This  is 
because  the  word  "proposition"  refers  to  the  meanings  of  statements, 
and  we  intend  to  ignore  the  meanings  (if  any)  of  our  statements.  However, 
we  shall  here  say  a  word  about  propositions  and  the  problems  connected 
with  them  just  to  show  how  useful  it  is  not  to  have  to  consider  these 
problems. 

A  proposition  is  the  meaning  of  a  statement,  and  one  says  that  the 
statement  expresses  the  proposition.  One  difficulty  that  arises  immediately 
is  that  of  deciding  when  two  different  statements  express  the  same  propo- 
sition. Sometimes  it  is  easy.  Thus  "three  is  a  prime"  and  "Drei  ist  eine 
Primzahl"  certainly  express  the  same  proposition.  However,  what  about 
"Three  is  a  prime"  and  "Three  is  greater  than  unity  and  is  not  divisible 
by  any  positive  integers  except  itself  and  unity"?  Do  they  express  the 
same  proposition  or  equivalent  propositions? 

Any  attempt  to  be  precise  and  pay  attention  to  meanings  would  involve 
us  with  such  problems  as  the  above,  which  are  really  quite  irrelevant  for 
mathematics.  For  mathematics,  it  is  the  form  that  must  be  considered, 
and  the  meaning  can  be  dispensed  with.  Our  symbolic  logic  will  accord 
with  this  doctrine. 

Actually,  although  one  carefully  builds  the  symbolic  logic  so  that  it  can 
be  used  without  reference  to  meaning,  this  does  not  mean  that  we  can  ignore 
meaning  in  devising  our  logic.  We  recall  that  our  symbolic  logic  is  intended 
to  give  a  precise  definition  of  an  intuitive  notion  of  logical  correctness.  So 
the  mechanical  operations  of  our  symbolic  logic,  though  devoid  of  meaning, 
must  nevertheless  manage  to  parallel  closely  the  intuitive  thought  processes 
based  on  meaning.  Clearly,  then,  careful  attention  is  paid  to  meaning  and 
intuitive  thought  processes  in  inventing  the  symbolic  logic. 

Now  that  the  symbolic  logic  has  been  invented,  we  could  present  it  to 
the reader merely as a mechanical system, without reference to the motivation
which underlies it. Certainly it is intended to be used in this way.
Nonetheless, the reader will find it easier to learn, remember, and use the
symbolic logic if we explain to him the underlying thought processes.
Consequently  much  of  our  discussion  in  the  earlier  chapters  will  be  quite 
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intuitive  in  character  and  not  particularly  precise.  Gradually,  as  our 
symbolic  logic  crystallizes  out  of  the  intuitive  background,  we  shall  become 
more  precise,  though  we  shall  never  lose  sight  of  our  intuitive  background 
completely  even  after  we  have  finally  completely  defined  our  symbolic 
logic  and  are  proceeding  quite  mechanically. 

4.  Advantages  and  Disadvantages  of  a  Symbolic  Logic.  We  have 
already mentioned some advantages of a symbolic logic over a simple
intuitive  notion  of  logical  correctness,  namely,  its  greater  precision  and  its 
lack  of  reference  to  meanings;  because  of  the  lack  of  reference  to  meanings, 
many  difficult  philosophical  problems  can  be  evaded  and  mechanical 
checks  of  proofs  are  possible. 

A  symbolic  logic  is  a  formal  system  and  as  such  has  the  advantage  of 
objectivity  which  is  inherent  in  any  formal  system.  This  can  be  illustrated 
by  a  reference  to  the  origins  of  geometry.  To  the  Egyptians,  a  straight 
line  was  a  stretched  string.  Now  two  stretched  strings  are  much  alike, 
but  not  completely  so,  and  thus  one  person's  idea  of  a  straight  line  would 
not  coincide  exactly  with  another  person's  idea.  As  an  extreme  instance, 
one man may be dealing with a fine silk cord, and the second man with a
towrope.  In  this  case,  their  "straight  lines"  would  be  quite  appreciably 
different.  Then  came  the  Greeks,  who  replaced  the  stretched  string  by  an 
abstract  idea  of  a  straight  line  which  was  defined  by  purely  formal  axioms. 
From  that  time  on,  the  straight  line  has  meant  the  same  thing  to  all  who 
accepted  the  Greeks'  definition.  Analogously,  by  means  of  symbolic  logic 
we  replace  a  person's  intuitive  ideas,  subjectively  conceived  and  full  of 
personal  psychological  overtones,  by  abstract  formal  ideas  which  can  be 
the  same  for  all  persons. 

A symbolic logic uses symbols and so has the advantages arising from the
use of symbols, in particular, greater ease in handling complex
manipulations. This is so familiar to mathematicians that an instance is probably
unnecessary.  We  cite  one  anyhow  for  completeness.  Consider  the  simple 
problem:  "Mary  is  now  three  times  as  old  as  Jane.  In  ten  years  Mary  will 
be  twice  as  old  as  Jane.  How  old  are  Mary  and  Jane  now?"  Algebraically 
this  is  almost  trivial.  We  take  the  symbols  M  and  J  to  stand  for  the  present 
ages  of  Mary  and  Jane,  getting 

M = 3J,

M + 10 = 2(J + 10).

Subtraction  gives  J  =  10,  whence  we  get  M  =  30.  The  point  is  that, 
though  this  is  very  simple  when  handled  by  symbols,  it  is  not  particularly 
easy  if  one  tries  to  handle  it  intuitively.  Certainly  one  can  get  the  answer 
by words alone, but it is so awkward to do so that variants of the above
problem are actually given as simple puzzles for those not accustomed to
the  use  of  algebraic  technique. 
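The answer to the puzzle can also be confirmed by a purely mechanical search. The following is a minimal sketch in Python; the variable names and the upper bound of 120 years are assumptions made only for this illustration.

```python
# Search all pairs of whole-number ages up to 120 (an arbitrary bound)
# for those satisfying M = 3J and M + 10 = 2(J + 10).
solutions = [
    (m, j)
    for j in range(0, 121)
    for m in range(0, 121)
    if m == 3 * j and m + 10 == 2 * (j + 10)
]

# Each entry is a pair (Mary's age, Jane's age).
print(solutions)
```

The search finds exactly one pair, Mary 30 and Jane 10, in agreement with the algebraic solution.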

It  is  interesting  to  note  that  the  algebraic  procedure  outlined  above  does 
not  differ  greatly  from  the  intuitive  procedure  that  one  might  use  to  solve 
the  problem  verbally.  In  other  words,  use  of  symbolic  manipulations  does 
not  necessarily  give  one  any  technique  for  solving  the  problem  which  was 
not  already  present  in  the  intuitive  case;  it  merely  makes  the  existing 
techniques  more  flexible,  more  effective,  and  more  apparent.  This  is 
characteristic  of  the  use  of  symbols. 

When  one  gives  a  precise  definition  of  a  concept,  then  there  arises  the 
possibility  of  generalizing  or  varying  the  concept  by  slight  alterations  in 
the  definition.  Thus,  as  long  as  the  early  Egyptians  were  thinking  of 
parallel lines as wagon tracks, there was no possibility of getting
non-Euclidean geometries in which parallel lines behave in quite unfamiliar fashions.
However, after the Greeks had defined parallel lines as straight lines which
never meet, and Euclid had defined geometry (we call it Euclidean
geometry nowadays) by specifying that parallel lines should behave essentially
like wagon tracks, then one could generalize to non-Euclidean geometries
by specifying other behaviors for parallel lines. Similarly, by going from
the simple intuitive concept of continuity to the precise ε-δ definition, one
can then introduce many variations of continuity, such as absolute
continuity, semicontinuity, and upper and lower semicontinuity.

An  analogous  situation  has  already  arisen  in  connection  with  symbolic 
logic.  There  are  now  several  different  systems  of  symbolic  logic  available 
which  differ  in  various  details.  We  have  chosen  that  one  which  seems  to  us 
most  nearly  in  accord  with  the  intuitive  notion  of  logical  correctness  as 
conceived  by  most  mathematicians. 

Thus  our  choice  of  a  system  of  symbolic  logic  is  arbitrary.  This  is  a 
disadvantage  in  that  later  study  may  show  our  choice  to  have  been  a  poor 
one.  It  is  also  an  advantage  in  that  if  we  ever  become  dissatisfied  with  our 
choice,  we  can  readily  change  it. 

The  main  disadvantage  of  a  system  of  symbolic  logic  is  that  it  is  a  formal 
system  divorced  from  intuition.  Intuition  arises  from  experience,  and  so 
may  be  expected  to  have  some  foundation  in  fact.  However,  a  formal 
system  is  merely  a  model  devised  by  human  minds  to  represent  some  facts 
perceived  intuitively.  As  such,  it  is  bound  to  be  artificial.  In  some  cases, 
the  artificiality  is  quite  clear.  Thus  electrical  engineers  are  taught  a  system 
for  computing  currents  and  voltages  in  rotating  electrical  machinery  by 
representing them as complex numbers of the form a + bi, i = √(-1) (see
Glasgow,  1936).  As  there  is  nothing  imaginary  about  the  currents  and 
voltages,  this  is  clearly  an  artificial  representation.  Nevertheless,  its 
advantages outweigh its obvious artificiality.

Probably the only time that the artificiality of a formal system does any
harm  is  when  the  users  of  the  system  ignore  or  overlook  the  fact  that  it  is 
artificial.  Thus,  for  two  thousand  years  it  was  supposed  that  the  physical 
universe  was  actually  a  Euclidean  three-dimensional  space.  This  inhibited 
men's  thinking  tremendously  and  was  a  great  misfortune.  Nowadays, 
astronomical measurements have made it seem quite likely that the
universe is non-Euclidean. Though this demonstrates the artificiality of
Euclidean  geometry,  nonetheless  Euclidean  geometry  is  still  extremely 
useful,  as  useful  in  fact  as  it  ever  was.  Thus  artificiality  is  not  a  serious 
disadvantage  if  one  does  not  lose  sight  of  the  artificiality. 

From  the  point  of  view  of  the  nonmathematician,  who  finds  it  difficult  to 
work  with  symbols,  use  of  symbols  is  a  disadvantage.  We  intend  the 
present  text  for  mathematicians,  to  whom  the  use  of  symbols  is  quite 
congenial,  and  so  make  no  apology  for  the  use  of  symbols. 

We  mentioned  the  possibility  of  mechanical  checking  of  proofs  as  an 
advantage.  It  is  not  wholly  an  advantage.  If  a  person  has  little  clerical 
skill,  he  is  liable  to  make  mistakes  in  his  mechanical  checking,  and  so  find 
it  of  little  value.  On  the  other  hand,  if  one  relies  exclusively  on  intuition, 
there  is  danger  of  overlooking  some  detail  which  appears  insignificant  but 
isn't.  The  truth  is  that  the  average  person  cannot  rely  exclusively  on 
either intuition or mechanical checking. For the average person,
mechanical checking is a valuable adjunct to intuition, but in doing the mechanical
checking  he  must  continually  refer  back  to  his  intuition  to  catch  clerical 
errors. 

We  summarize  the  above  points.  Although  we  think  that  the  average 
mathematician  will  find  that  a  study  of  symbolic  logic  is  very  helpful  in 
carrying  out  mathematical  reasoning,  we  do  not  recommend  that  he  should 
completely  abandon  his  intuitive  methods  of  reasoning  for  exclusively 
formal methods. Rather, he should consider the formal methods as a
supplement to his intuitive methods to provide mechanical checks of critical
points,  and  to  provide  the  assistance  of  symbolic  operations  in  complex 
situations,  and  to  increase  his  precision  and  generality.  He  should  not 
forget that his intuition is the final authority, so that, in case of an
irreconcilable conflict between his intuition and some system of symbolic logic, he
should  abandon  the  symbolic  logic.  He  can  try  other  systems  of  symbolic 
logic,  and  perhaps  find  one  more  to  his  liking,  but  it  would  be  difficult  to 
change  his  intuition. 


CHAPTER  II 
THE  STATEMENT  CALCULUS 

1.  Statement  Functions.  As  indicated  in  the  previous  chapter,  we  shall 
not  proceed  at  once  to  a  precise  definition  of  a  statement.  We  have  told 
the  reader  that  essentially  a  statement  is  a  declarative  sentence  (not 
necessarily  true)  which  deals  exclusively  with  mathematical  and  logical 
matters.  We  shall  gradually  make  this  idea  precise,  but  in  the  present 
chapter we shall confine our attention to certain of the very simplest ways
of building statements, by use of the so-called "statement functions."

We derive all the statement functions from two basic ones, "&" and "~".
Consider two statements, "P" and "Q", of symbolic logic which are
translations of the English sentences "A" and "B". Then "(P&Q)" is the
statement which is a translation of "A and B" and "~P" is the statement
which is a translation of the negation of the sentence "A". If "A"
happens to be a simple sentence, the negation would most usually be formed by
inserting a "not" into "A" at the grammatically proper place. Thus "&"
is the translation of "and", and, allowing for the difference of sentence
structure, "~" is the translation of "not". Hence we usually refer to
"&" and "~" as "and" and "not", and usually read "(P&Q)" and "~P"
as "P and Q" and "not P", respectively. However, when we wish to be
very careful we refer to "&" and "~" by their correct names "ampersand"
and "curl" or "twiddle".

To  illustrate,  let  "P"  and  "Q"  be  translations  of  "It  is  raining  now"  and 
"It  is  not  cloudy  now".  Then  "(P&Q)"  is  a  translation  of  "It  is  raining 
now and it is not cloudy now", and "~P" and "~Q" are translations of
"It is not raining now" and "It is cloudy now". Finally "~(P&Q)" is a
translation  of  "Either  it  is  not  raining  now  or  else  it  is  cloudy  now  or  else 
it  is  both  cloudy  and  not  raining  now"  or  of  "It  is  not  now  both  raining  and 
not  cloudy"  or  of  some  such  negation  of  "It  is  raining  now  and  it  is  not 
cloudy  now". 

"(P&Q)"  has  properties  analogous  to  the  product  of  two  numbers  in 
arithmetic  or  algebra.  For  this  reason,  it  is  called  the  logical  product  of 
"P"  and  "Q",  which  are  called  the  factors,  and  is  often  written  "(P.Q)"  or 
simply  "(PQ)".  Also,  one  omits  the  parentheses  whenever  possible  without 
ambiguity,  so  that  it  may  also  be  written  "P&Q",  "P.Q",  or  "PQ".  To 
diminish the number of possible cases of ambiguity, we agree that whenever
a "~" occurs it shall affect as little as possible of what follows it. This is
expressed  as  follows. 

Convention. Any given occurrence of "~" shall have as small a scope
as  possible. 

As an illustration, consider "~PQ" (or either of the alternative forms
"~P.Q" or "~P&Q"). According to our convention, the "~" affects
"P" but not "Q", and so we understand "~PQ" to mean "(~P)Q" (or
"(~P).Q" or "(~P)&Q"). Without our convention, there would be the
possibility that "~PQ" might mean "~(PQ)".

The expression "P~Q" is unambiguous even without our convention,
since it clearly can mean nothing but "P&(~Q)".

By means of "&" and "~", we can translate many other English
conjunctions besides "and" into symbolic logic. As before, let "P" and "Q"
be statements of symbolic logic which are translations of the English
sentences "A" and "B", and let us seek to find a translation for "Either A or
B". First we should agree whether we interpret "Either A or B" in the
exclusive sense of "Either A or B but not both" or in the inclusive sense of
"Either A or B or both". According to each of the four best unabridged
dictionaries, the exclusive use is the only correct use, and the inclusive use
has no justification at all in correct English. Nonetheless, in mathematics
the inclusive form of "or" is very commonly used, and in everyday
language, it is often not clear which is intended. In legal documents one
commonly finds the inclusive "or" expressed as "A and/or B".

In some languages, there are different words for the exclusive and
inclusive "or". Thus in Latin, the word "aut" denotes an exclusive "or" so
that "aut" means "or, but not both", whereas "vel" denotes an inclusive
"or" so that "vel" means "and/or". We shall translate both the exclusive
"or" and the inclusive "or" into symbolic logic.

We take first the inclusive "or". We are seeking the interpretation of
"A and/or B" or the equivalent "Either A or B or both". This statement
is equivalent to denying that "A" and "B" are simultaneously false, and
we use this fact to carry out the translation. That "A" is false would be
translated as "~P", and that "B" is false would be translated as "~Q".
That both are false would be translated as "(~P)&(~Q)", which we can
simplify unambiguously to "~P~Q". The denial of this is then
translated as "~(~P~Q)". So we conclude that the translation of "Either A
or B or both" is "~(~P~Q)".
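The translation just obtained can be checked by listing cases. The following is a minimal sketch in Python, assuming for illustration only that a statement is modelled by one of the truth values True and False; the helper names neg, conj, and incl_or are hypothetical and are not part of the symbolic logic.

```python
from itertools import product


def neg(p):
    """Model of "~P"."""
    return not p


def conj(p, q):
    """Model of "PQ", that is, "P&Q"."""
    return p and q


def incl_or(p, q):
    """Model of "~(~P~Q)", built from "&" and "~" only."""
    return neg(conj(neg(p), neg(q)))


# List the value of "~(~P~Q)" for each of the four truth assignments.
for p, q in product([True, False], repeat=2):
    print(p, q, incl_or(p, q))
```

Only the assignment in which both "P" and "Q" are false gives a false result, which is what the inclusive "or" requires.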

To translate "Either A or B but not both", it suffices to adjoin the
additional statement "not both", which certainly would be translated "~(PQ)".
So the translation of "Either A or B but not both" is
"(~(~P~Q))&(~(PQ))", which we can simplify unambiguously to "~(~P~Q)&~(PQ)"
or "~(~P~Q)~(PQ)".
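As a check on the exclusive translation, here is a minimal sketch in Python under the same illustrative assumption that statements are modelled by True and False. The names neg, conj, and excl_or are hypothetical helpers introduced only for this example.

```python
from itertools import product


def neg(p):
    """Model of "~P"."""
    return not p


def conj(p, q):
    """Model of "PQ", that is, "P&Q"."""
    return p and q


def excl_or(p, q):
    """Model of "~(~P~Q)~(PQ)": at least one is true, and not both."""
    return conj(neg(conj(neg(p), neg(q))), neg(conj(p, q)))


# The result is true exactly when "P" and "Q" have different truth values.
for p, q in product([True, False], repeat=2):
    print(p, q, excl_or(p, q))
```

The two assignments on which this differs from the inclusive form are the ones where "P" and "Q" are both true or both false; only the first of these distinguishes the two senses of "or".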

Although the exclusive "or" is the grammatically correct one, being
sanctioned by the leading unabridged dictionaries, nevertheless, the
inclusive "or" is much more commonly used in mathematics. In fact, it is
used so often that it becomes burdensome to write the three "~"s and
the two parentheses involved in "~(~P~Q)", so that we adopt a
shorthand, and commonly write "(PvQ)" in place of "~(~P~Q)". Whenever
no ambiguity will result, we simplify "(PvQ)" to "PvQ".

One can think of the "v" of "PvQ" as denoting the Latin word "vel".
We  shall  commonly  read  "v"  as  "or". 

Strictly speaking, the notation "PvQ" is not part of symbolic logic, and
if one were being really careful one would always write "~(~P~Q)".
However, it is very handy to write "PvQ" instead, and to agree that
whenever "PvQ" appears it is really "~(~P~Q)" which occurs. Similar
considerations apply to our omission of parentheses. Strictly speaking, it is
not permissible to omit parentheses, and we omit them only with the
understanding that in any precise formulation they would actually be present.

To put it another way, omission of parentheses and replacement of
"~(~P&~Q)" by "PvQ" are not part of our symbolic logic, but only a
convenience which we permit ourselves in talking about it.

Our convention about the scope of "~" requires that "~PvQ" should
mean "(~P)vQ" rather than "~(PvQ)". To remove the ambiguity in
"PQvR" and "PvQR", we make use of an algebraic analogy. For this, we
call "PvQ" the logical sum of "P" and "Q", which are called the summands.
Then the algebraic formula analogous to "PQvR" would be "xy + z",
since "PQ" is the logical product of "P" and "Q". As we always interpret
"xy + z" to mean "(xy) + z" and never "x(y + z)", we shall
correspondingly interpret "PQvR" as "(PQ)vR" rather than "P(QvR)". By the
algebraic analogy, we likewise interpret "PvQR" as "Pv(QR)".

Another  compound  sentence  which  occurs  in  mathematical  reasoning 
with  great  frequency  is  "If  A,  then  B."  Many  variants  are  in  common 
use,  the  most  common  of  which  are: 

"B  is  a  necessary  condition  for  A." 

"A  is  a  sufficient  condition  for  B." 

"B  if  A." 

"A  only  if  B." 

"A  implies  B." 

Some logicians insist that the terminology "A implies B" be reserved for
quite  a  different  purpose.  This  is  not  standard  mathematical  practice, 
and  we  follow  the  mathematical  practice  which  takes  "A  implies  B"  and 
"If  A,  then  B"  as  interchangeable. 

We seek a translation for "If A, then B." We note that, whenever "If A,
then B" is valid, then we cannot have both "A" true and "B" false.
Conversely, whenever we cannot have both "A" true and "B" false, we can
surely conclude that "If A, then B" (since if not "B", then we would have
"A" true and "B" false). So we can translate "If A, then B" by translating
"We cannot have both 'A' true and 'B' false." Clearly we translate
". . . both 'A' true and 'B' false" by "P~Q". So we translate "We cannot
have both 'A' true and 'B' false" by "~(P~Q)".

Conclusion. We translate "If A, then B" into "~(P~Q)".

The shorthand for this is "(P ⊃ Q)" or "P ⊃ Q". We read this as
"P implies Q". We refer to the statement "P ⊃ Q" as an implication, or
conditional, and "P" as the hypothesis and "Q" as the conclusion.
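The behavior of this translation can be seen by listing cases. The following is a minimal sketch in Python, again assuming for illustration only that statements are modelled by True and False; neg, conj, and implies are hypothetical helper names.

```python
from itertools import product


def neg(p):
    """Model of "~P"."""
    return not p


def conj(p, q):
    """Model of "PQ", that is, "P&Q"."""
    return p and q


def implies(p, q):
    """Model of "P ⊃ Q", that is, "~(P~Q)"."""
    return neg(conj(p, neg(q)))


# "P ⊃ Q" fails only when the hypothesis is true and the conclusion false.
for p, q in product([True, False], repeat=2):
    print(p, q, implies(p, q))
```

In particular, under this model the implication comes out true whenever the hypothesis is false, which is a direct consequence of translating "If A, then B" as the denial of "A true and B false".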

There is no very good analogue for "P ⊃ Q" in algebra. Perhaps the best
is "x > y". Poor though this analogy is, we use "xy > z" and "x + y > z"
as an analogy for interpreting "PQ ⊃ R" and "PvQ ⊃ R" as "(PQ) ⊃ R"
and "(PvQ) ⊃ R", respectively. Similarly for "P ⊃ QR" and "P ⊃ QvR".

Still another compound sentence of frequent occurrence is "A if and only
if B," or its variant "A is a necessary and sufficient condition for B."
Obviously this is to be translated as "(P ⊃ Q)(Q ⊃ P)".

The shorthand for this is "(P ≡ Q)" or "P ≡ Q". We read this as
"P is equivalent to Q" or just "P equivalent Q". We refer to the
statement "P ≡ Q" as an equivalence, or biconditional, and "P" as the left
side and "Q" as the right side.
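That the product of the two implications behaves as "if and only if" should can be confirmed by listing cases. The following is a minimal sketch in Python, assuming for illustration only that statements are modelled by True and False; the helper names are hypothetical.

```python
from itertools import product


def neg(p):
    """Model of "~P"."""
    return not p


def conj(p, q):
    """Model of "PQ", that is, "P&Q"."""
    return p and q


def implies(p, q):
    """Model of "P ⊃ Q", that is, "~(P~Q)"."""
    return neg(conj(p, neg(q)))


def equiv(p, q):
    """Model of "P ≡ Q", that is, "(P ⊃ Q)(Q ⊃ P)"."""
    return conj(implies(p, q), implies(q, p))


# "P ≡ Q" holds exactly when both sides have the same truth value.
for p, q in product([True, False], repeat=2):
    print(p, q, equiv(p, q))
```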

This may be likened to "x = y" in algebra. Accordingly, by analogy with
"xy = z" and "x + y = z", we interpret "PQ ≡ R" and "PvQ ≡ R" as
"(PQ) ≡ R" and "(PvQ) ≡ R", respectively. Similarly for "P ≡ QR"
and "P ≡ QvR".

In algebra, "(xy)z", "x(yz)", and "xyz" all refer to the product of the
three numbers "x", "y", and "z", and it is not usual to make any
distinction between them. In deriving algebra from a set of axioms, it is sometimes
necessary to pretend that there might be a distinction between "(xy)z" and
"x(yz)", but this is done only for the purpose of allowing one to state and
prove the theorem that no distinction need be made. The situation is
quite analogous with the logical product. It is usually permissible to
interpret "PQR" as either "(PQ)R" or "P(QR)" without prejudice. In
the few cases where it is not, it is the custom to "associate to the left."
That is, "PQR" is considered to be "(PQ)R", "PQRS" is considered to be
"((PQ)R)S", etc.

In dealing with sums, whether algebraic or logical, exactly the same
situation occurs. In algebra, we seldom need to distinguish between
"(x + y) + z", "x + (y + z)", and "x + y + z". In logic, we seldom
need to distinguish which of "(PvQ)vR" and "Pv(QvR)" is meant by
"PvQvR", and when we do, we associate to the left, interpreting "PvQvR"
as "(PvQ)vR".


16  LOGIC  FOR  MATHEMATICIANS  [Chap.  II 

When we come to such statements as "P ⊃ Q ⊃ R" the algebraic analogy
conflicts with the rule of association to the left. In algebra "x > y > z"
means "x > y and y > z", so that we should expect to interpret "P ⊃ Q ⊃
R" as "(P ⊃ Q)&(Q ⊃ R)". On the other hand, association to the left
would give "(P ⊃ Q) ⊃ R", which is quite a different statement. Under
the circumstances it might be just as well to consider "P ⊃ Q ⊃ R" as
ambiguous, and to write explicitly whichever of "(P ⊃ Q) ⊃ R", "P ⊃
(Q ⊃ R)", or "(P ⊃ Q)&(Q ⊃ R)" is intended.
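That the three candidate readings really are different statements can be seen by comparing them case by case. The following is a minimal sketch in Python, assuming for illustration only that statements are modelled by True and False; the ASCII character ">" stands in for "⊃" in the labels, and all names are hypothetical.

```python
from itertools import product


def implies(p, q):
    """Model of "P ⊃ Q", that is, "~(P~Q)"."""
    return not (p and not q)


# The three readings of "P ⊃ Q ⊃ R" discussed in the text.
readings = {
    "(P>Q)>R": lambda p, q, r: implies(implies(p, q), r),
    "P>(Q>R)": lambda p, q, r: implies(p, implies(q, r)),
    "(P>Q)&(Q>R)": lambda p, q, r: implies(p, q) and implies(q, r),
}

rows = list(product([True, False], repeat=3))
tables = {name: [f(*row) for row in rows] for name, f in readings.items()}

# Compare each pair of readings and count the assignments on which they differ.
names = list(tables)
for i, a in enumerate(names):
    for b in names[i + 1:]:
        differing = [row for row, x, y in zip(rows, tables[a], tables[b]) if x != y]
        print(a, "vs", b, "differ on", len(differing), "assignments")
```

Each pair differs on at least one assignment; for instance, with "P", "Q", and "R" all false, "(P ⊃ Q) ⊃ R" is false while "P ⊃ (Q ⊃ R)" is true.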

In the case of "P ≡ Q ≡ R", the algebraic analogy again conflicts with
the rule of association to the left. However, association to the left yields
"(P ≡ Q) ≡ R", which is of limited usefulness, whereas, since "x = y = z"
means "x = y and y = z", the algebraic analogy yields "(P ≡ Q)&(Q ≡ R)",
which is quite useful. Accordingly, in this case we follow the algebraic
analogy and interpret "P ≡ Q ≡ R" as "(P ≡ Q)&(Q ≡ R)", "P ≡ Q ≡
R ≡ S" as "(P ≡ Q)&(Q ≡ R)&(R ≡ S)", etc.

The forms which might be shortened to "P ⊃ Q ≡ R" or "P ≡ Q ⊃ R"
are of infrequent occurrence, so that it is not worth while to decide on an
interpretation for "P ⊃ Q ≡ R" or "P ≡ Q ⊃ R". Instead, we must
leave in enough parentheses to resolve any ambiguity.

The rule about the scope of "~" covers any interpretations involving "~".

One  can  define  many  other  connectives,  but  the  ones  we  have  listed 
appear to be the ones which occur with sufficient frequency to make a
special  notation  for  them  worth  while. 

The form "~P" defines a statement function or function of the
statement variable "P" in the sense that, if one puts a statement in for "P",
one gets another statement "~P". This is quite analogous to the way in
which one defines a function of a real number. Thus "-x" defines a
function of the real number "x" in the sense that, if one puts a real number in
for "x", one gets another real number "-x". Likewise "PP" defines
another statement function, which can be considered the analogue of the
real function defined by "x²".

One can also have statement functions of more than one variable. Thus
"PQ" and "PvQR" define statement functions of two and three variables
which are respectively analogous to the real functions defined by "xy" and
"x + yz".

We should warn the reader of one place where "if" means "if and only
if" and should be translated as "≡". This is in definitions. Thus on page
18 of Fort, 1930, we find:¹

"Definition 9. The infinite series, a_0, a_1, a_2, . . . , is said to converge if s_n
approaches a limit when n becomes infinite."

1  From  "Infinite  Series"  by  Tomlinson  Fort,  copyright  1930  by  Clarendon  Press, 
courtesy of Oxford University Press.

Here it is certain that Fort intends for "if" to be interpreted as "if and
only if." However, in using "if" in this sense, Fort is only following
standard usage, which decrees that an "if" used in this fashion in a
definition invariably means "if and only if."

Occasionally,  one  finds  a  less  ambiguous  wording  in  a  definition.  Thus, 
Fort on page 19 states:¹

"Definition 11. ∑_{n=-∞}^{∞} a_n is said to converge when and only when
∑_{n=0}^{∞} a_n and ∑_{n=-1}^{-∞} a_n both converge . . . ."

However,  in  a  large  majority  of  cases,  one  will  merely  find  "if"  used  in 
the sense of "if and only if" in definitions, and should translate such
definitions into symbolic logic by use of "≡".

EXERCISES 

II.1.1. Write two short essays (not more than five sentences apiece)
concerning  the  use  of  "and"  and  "but"  as  conjunctions  between  complete 
statements,  telling: 

(a)  The  logical  difference  between  "and"  and  "but". 

(b)  The  psychological  difference  between  "and"  and  "but". 

II.1.2. If "P" and "Q" are translations for "x² > 0" and "x > 0",
write a translation for "x² > 0 whenever x > 0".

II.1.3. If "P", "Q", and "R" are translations for "x = y", "x/z = y/z",
and "z = 0", write a translation for "If x = y, then x/z = y/z except when
z = 0".

II.1.4.  If  "P"  and  "Q"  are  translations  for  "(n  -  1)!  +  1  is  divisible 
by n" and "n is a prime", write a translation for "(n - 1)! + 1 is not
divisible by n unless n is a prime".

II.1.5. Show that "~(P ≡ Q)" is the same statement as "P~QvQ~P".

II.1.6. If "P" and "Q" are known to be true, what can one say about
"PQ"? If "PQ" is known to be true, what can one say about "P" and
"Q"? If "P ⊃ Q" and "Q ⊃ P" are known to be true, what can one say
about "P ≡ Q"?

II.1.7. If "P" is known to be true, what can one say about "~~P"?

II.1.8. On the basis that "R~R" is a contradiction, show that
"~P~Q(PvQ)" and "P~Q(P ⊃ Q)" are contradictions.

II.1.9. If we are told that "P ⊃ ~~P" is true for every statement
"P", what can we infer about "~P~Q ⊃ ~(PvQ)", "P~Q ⊃ ~(P ⊃ Q)",
and "(P ≡ Q) ⊃ ~(P~QvQ~P)"?

II.1.10. Show that "~PvP" is the same statement as "~~P ⊃ P".

II.1.11. If "P", "Q", "R", and "S" are translations for "p is a prime",
"n is an integer which divides p", "n = 1", and "n = p", write a translation
for "A necessary condition for p to be a prime is that, if n is an integer
which divides p, then n = 1 or n = p".

1 Ibid.

II.1.12. If "P" and "Q" are translations of "C is on the perpendicular
bisector of the line joining A and B" and "C is equidistant from A and B",
write a translation of "C is equidistant from A and B if and only if it is on
the  perpendicular  bisector  of  the  line  joining  them". 

II.1.13. If "P", "Q", and "R" are translations of "a < b + c",
"c > 0", and "a < b", write a translation of "If a < b + c whenever
c > 0, then a < b".

II.1.14. If "P" and "Q" are translations of "Triangle ABC has two
angles  equal"  and  "Triangle  ABC  is  isosceles",  write  a  translation  of 
"A  triangle  with  two  angles  equal  is  isosceles". 

II.1.15. If "P", "Q", "R", and "S" are translations of "Triangle ABC
is congruent to triangle DEF", "AB = DE", "BC = EF", and
"CA  =  FD",  write  a  translation  of  "Two  triangles  are  congruent  if  the 
three  sides  of  the  one  are  equal  respectively  to  the  three  sides  of  the  other". 

II.1.16. If "P", "Q", and "R" are translations of "f(n) is a polynomial
with integral coefficients", "f(n) is a constant", and "f(n) is a prime for
all n", write a translation of "No polynomial f(n) with integral coefficients,
other  than  a  constant,  can  be  prime  for  all  n". 

II.1.17. If "P", "Q", "R", "S", and "T" are translations of "p and q
are distinct odd primes", "p is of the form 4n + 3", "q is of the form
4n + 3", "(p/q) = (q/p)", and "(p/q) = -(q/p)", write a translation of
"If p and q are distinct odd primes, then (p/q) = (q/p) unless both p and q
are of the form 4n + 3, in which case (p/q) = -(q/p)".

II.1.18. If "P", "Q", "R", and "S" are translations of "AB is a chord
of circle O", "CD is a chord of circle O", "AB = CD", and "AB and CD are
equidistant from the center of circle O", write a translation of "In the same
circle,  if  two  chords  are  unequal,  they  are  unequally  distant  from  the 
center". 


II.1.19. If "P" and "Q" are translations of "∏_{m=1}^{∞} (1 + a_m) converges"
and "As n → ∞, ∏_{m=1}^{n} (1 + a_m) tends to a limit other than zero", write a
translation of "We say that ∏_{m=1}^{∞} (1 + a_m) converges if, as n → ∞,
∏_{m=1}^{n} (1 + a_m) tends to a limit other than zero".

II.1.20. If "P", "Q", "R", and "S" are translations of "f(x) = g(x)h(x)",
"r is a root of f(x)", "r is a root of g(x)", and "r is a root of h(x)", write a
translation of "If f(x) = g(x)h(x), then any root of f(x) is a root of g(x)
or h(x)".

II.1.21. If "P", "Q", and "R" are translations of "φ(x) is an increasing
positive function of x", "∑_{n=1}^{∞} 1/φ(n) is convergent", and
"∫_{1}^{∞} dx/φ(x) is convergent", write a translation of "If φ(x) is an
increasing positive function of x for which ∑_{n=1}^{∞} 1/φ(n) is divergent,
then ∫_{1}^{∞} dx/φ(x) is divergent. On the other hand, if ∑_{n=1}^{∞} 1/φ(n)
is convergent, then ∫_{1}^{∞} dx/φ(x) is convergent".

II.1.22. If "P", "Q", and "R" are translations of "A is a right angle",
"B is a right angle", and "A = B", write a translation of "Any two right
angles  are  equal". 

11.1.23.  If  "P",  "Q",  and  "R"  are  translations  of  "a  is  a  unity", 
"The  norm  of  a  is  +1",  and  "The  norm  of  a  is  —  1",  wTite  a  translation  of 
"The  norm  of  a  unity  is  ±1,  and  every  number  whose  norm  is  ±1  is  a 
unity". 

2.  The  Dot  Notation.  It  will  be  noticed  that,  by  our  conventions  for 
omission of parentheses, we have established a sort of hierarchy of symbols.
"~" is weakest in the sense that we make the interpretations listed in the
following table, where the formula on the right is the interpretation of that
on the left:

    ~PQ           (~P)Q
    ~PvQ          (~P)vQ
    ~P ⊃ Q        (~P) ⊃ Q
    ~P ≡ Q        (~P) ≡ Q.

Thus "~" is weaker than each of "&", "v", "⊃", and "≡". The next
weakest is "&", as is shown by the following interpretations:

    PQvR          (PQ)vR
    PvQR          Pv(QR)
    PQ ⊃ R        (PQ) ⊃ R
    P ⊃ QR        P ⊃ (QR)
    PQ ≡ R        (PQ) ≡ R
    P ≡ QR        P ≡ (QR).

Next weakest is "v", as shown by the interpretations:

    PvQ ⊃ R       (PvQ) ⊃ R
    P ⊃ QvR       P ⊃ (QvR)
    PvQ ≡ R       (PvQ) ≡ R
    P ≡ QvR       P ≡ (QvR).

The symbols "⊃" and "≡" are of equal strength, since "P ⊃ Q ≡ R"
and "P ≡ Q ⊃ R" are ambiguous.

If we write "PQ" with an "&", the ampersand is still weaker than
any symbol except "~", as shown by the interpretations:

    ~P&Q          (~P)&Q
    P&QvR         (P&Q)vR
    PvQ&R         Pv(Q&R)
    P&Q ⊃ R       (P&Q) ⊃ R
    P ⊃ Q&R       P ⊃ (Q&R)
    P&Q ≡ R       (P&Q) ≡ R
    P ≡ Q&R       P ≡ (Q&R).
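
The hierarchy described so far (before any dots are introduced) can be tried out mechanically. The following is only a minimal sketch in Python, under stated assumptions: ">" and "=" are ASCII stand-ins for the horseshoe and the triple bar, capital letters are statements, and the names `tokenize`, `parse`, and `render` are hypothetical helpers invented for this illustration. The sketch reports any unparenthesized chain of ">" and "=" as ambiguous and makes no attempt at the dot notation.

```python
import re

# Assumptions of this sketch: ">" stands for the horseshoe, "=" for the
# triple bar, capital letters are statements, lowercase "v" is the logical sum.
TOKEN = re.compile(r"\s*([A-Z~&v>=()])")

def tokenize(text):
    text, pos, tokens = text.rstrip(), 0, []
    while pos < len(text):
        match = TOKEN.match(text, pos)
        if match is None:
            raise ValueError(f"unexpected character at position {pos}")
        tokens.append(match.group(1))
        pos = match.end()
    return tokens

def parse(text):
    tokens, pos = tokenize(text), 0

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def take():
        nonlocal pos
        pos += 1
        return tokens[pos - 1]

    def atom():
        token = take() if peek() is not None else None
        if token == "(":
            inner = implication()
            if peek() != ")":
                raise ValueError("missing ')'")
            take()
            return inner
        if token is not None and token.isupper():
            return token
        raise ValueError(f"unexpected token {token!r}")

    def negation():      # "~" is weakest: it takes only the next unit
        if peek() == "~":
            take()
            return ("~", negation())
        return atom()

    def conjunction():   # "&", written or implicit, is next weakest
        left = negation()
        while peek() is not None and (peek() in "&~(" or peek().isupper()):
            if peek() == "&":
                take()
            left = ("&", left, negation())
        return left

    def disjunction():   # then "v"
        left = conjunction()
        while peek() == "v":
            take()
            left = ("v", left, conjunction())
        return left

    def implication():   # ">" and "=" are strongest and of equal strength
        left = disjunction()
        if peek() in (">", "="):
            op = take()
            right = disjunction()
            if peek() in (">", "="):
                raise ValueError("ambiguous without dots or parentheses")
            return (op, left, right)
        return left

    tree = implication()
    if peek() is not None:
        raise ValueError(f"unexpected token {peek()!r}")
    return tree

def render(tree):
    """Write a parsed formula back out with every grouping made explicit."""
    def wrap(part):
        return part if isinstance(part, str) else "(" + render(part) + ")"
    if isinstance(tree, str):
        return tree
    if tree[0] == "~":
        return "~" + wrap(tree[1])
    return wrap(tree[1]) + tree[0] + wrap(tree[2])
```

Under these assumptions, `render(parse("~PQ"))` gives `(~P)&Q` and `render(parse("PvQR"))` gives `Pv(Q&R)`, in agreement with the tables above, while `parse("P > Q = R")` raises an error because the formula is ambiguous.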

There will be occasions when a strong "&" would be useful, and for this
reason, we agree that if we write "PQ" with a dot, "P.Q", it is then stronger
than any other symbol. That is, we agree on the following interpretations:

    P.QvR         P(QvR)
    P.Q ⊃ R       P(Q ⊃ R)
    P.Q ≡ R       P(Q ≡ R)
    PvQ.R         (PvQ)R
    P ⊃ Q.R       (P ⊃ Q)R
    P ≡ Q.R       (P ≡ Q)R
    P.Q ≡ RS      P(Q ≡ (RS))
    P.Q ≡ R.S     P(Q ≡ R)S.

By association to the left, the last formula on the right would be interpreted
as "(P(Q ≡ R))S".

The next stage in the dot notation is to think of the dot not as standing
for a logical product, but as strengthening whatever symbol it stands by.
In the cases above, it could be understood that the dot is standing by an
implicit "&" and strengthening it. Now we allow the dot to stand by any
symbol and strengthen it. As none of "v", "⊃", or "≡" can be implicit,
the dot must stand on one side or the other, in which case it strengthens
that side only. If we wish to strengthen both sides, we use two dots, one
on each side. Thus we have the following interpretations, in which the
right-hand column contains the interpretations, and the middle column
contains an intermediate formula obtained by inserting the pair of
parentheses indicated by the dot.

    P ⊃ QvR ≡ S                                Ambiguous
    P ⊃ .QvR ≡ S       P ⊃ (QvR ≡ S)           P ⊃ ((QvR) ≡ S)
    P ⊃ QvR. ≡ S       (P ⊃ QvR) ≡ S           (P ⊃ (QvR)) ≡ S
    P ⊃ Q.vR ≡ S       (P ⊃ Q)vR ≡ S           ((P ⊃ Q)vR) ≡ S
    P ⊃ Qv.R ≡ S       P ⊃ Qv(R ≡ S)           P ⊃ (Qv(R ≡ S))
    P ⊃ Q.v.R ≡ S      (P ⊃ Q)v(R ≡ S)         (P ⊃ Q)v(R ≡ S)

Note  that,  as  we  said  above,  a  dot  on  one  side  of  a  symbol  strengthens  that 
side  only.  Thus,  only  in  the  last  illustration,  where  the  "v"  is  strengthened 
on  each  side,  is  the  "v"  the  dominant  symbol  of  the  formula.  On  the  other 
hand, a dot standing for "&" operates in both directions.

Any  symbol  strengthened  by  a  dot  is  stronger  on  that  side  than  any 
symbol  not  strengthened  by  a  dot.  If  two  symbols  are  each  strengthened 
by  a  dot,  then  they  have  the  same  strength  relative  to  each  other  as  though 
no  dot  were  present.    Thus  we  have  the  interpretations: 


    P.QvR. ⊃ S             (P(QvR)) ⊃ S
    P ≡ .Q ⊃ R.vS          P ≡ ((Q ⊃ R)vS).

However, notice the interpretations:

    P.Q ≡ .RvS ⊃ T         P(Q ≡ ((RvS) ⊃ T))
    P.Q ≡ R.vS             (P(Q ≡ R))vS
    P.Q ≡ .R ⊃ S.vT        P(Q ≡ ((R ⊃ S)vT)).

In the first of the last three illustrations the dot on the right of the "≡"
strengthens it on the right so that it is stronger than the "⊃". However,
the "≡" is not strengthened on the left, and so the dot between "P" and
"Q" is stronger. In the second, the dot on the left of "v" strengthens it and
so it and the implicit "&" between the "P" and "Q" have the usual strength
relative to each other. However, in the third, the dot on the left of the "v"
strengthens it, but the "≡" with the dot on the right is even stronger, to
the right. On the left the "≡" is not strengthened, and the dot between
the "P" and "Q" is stronger.

A  good  way  to  treat  such  formulas  is  to  replace  each  dot  in  succession 
by  a  pair  of  parentheses,  starting  with  the  strongest  dot  in  the  formula, 
and  working  down.  The  strongest  dot  is  that  next  to  the  strongest  symbol, 
of  course.  Thus,  in  the  last  formula  considered,  we  could  successively 
replace  dots  by  pairs  of  parentheses  as  shown  in  the  table  below : 

    P.Q ≡ .R ⊃ S.vT
    P.Q ≡ (R ⊃ S.vT)
    P.Q ≡ ((R ⊃ S)vT)
    P(Q ≡ ((R ⊃ S)vT)).

The dot next to the "≡", being strongest, is first replaced by a pair of
parentheses, giving the second formula of the table. Then the dot next
to the "v" is replaced by the appropriate pair of parentheses, and finally the
dot between the "P" and "Q".

Rather  commonly,  if  there  are  two  dots  of  equal  strength,  either  they  are 
quite  independent  of  each  other,  or  else  the  formula  is  ambiguous.  Thus, 
in "P ⊃ Q.v.R ≡ S", the dots of equal strength are quite independent,
whereas in the formula "PvQ ⊃ .R ≡ P. ⊃ Q" the two dots of equal
strength conflict and the formula is ambiguous, since we would have to
replace the dots by parentheses as follows: "(PvQ) ⊃ (R ≡ P) ⊃ Q", and
this is ambiguous. In other cases of conflict, the ambiguity is sometimes
relieved by special rules. Thus, in "P.Q ≡ R.S", we must replace the dots
by parentheses as follows: "P(Q ≡ R)S", but this is rendered unambiguous
by the rule of association to the left. Likewise, in

    P ⊃ (Q ⊃ R). ≡ .PQ ⊃ R. ≡ .Q ⊃ (P ⊃ R)

we have to replace the dots by parentheses as follows:

    (P ⊃ (Q ⊃ R)) ≡ (PQ ⊃ R) ≡ (Q ⊃ (P ⊃ R)).

By the algebraic analogy, this is then interpreted as

    ((P ⊃ (Q ⊃ R)) ≡ (PQ ⊃ R))&((PQ ⊃ R) ≡ (Q ⊃ (P ⊃ R))).

If we wish to strengthen a symbol very much, we will put a double dot,
":", or triple dot, ".:" or ":.", or even a quadruple dot, "::" next to the
symbol. A symbol strengthened by a double dot is, on that side, stronger
than any symbol with a single dot or with no dot. If two symbols are both
strengthened by a double dot, then they have the same strength relative to
each other as though no dots were present. Similar considerations apply
to symbols strengthened by triple or quadruple dots. A symbol can have a
different number of dots on the two sides of it, including the possibility of
no dots at all on one side. We have the illustrations:

    P:.Q ≡ .R ⊃ S:vT       P((Q ≡ (R ⊃ S))vT)
    P:Q ≡ .R ⊃ S:vT        (P(Q ≡ (R ⊃ S)))vT
    P:Q ≡ :R ⊃ S:vT        P(Q ≡ ((R ⊃ S)vT)).

The  last  of  these  is  just  a  repetition  with  double  dots  of  an  illustration 
that  was  given  earlier  with  single  dots. 

In  replacing  multiple  dots  by  parentheses,  it  is  advisable  to  start  with 
the  strongest. 

We  shall  feel  free  to  use  more  dots  or  parentheses  than  the  minimum 
required  to  dispel  ambiguity  if  we  thereby  make  the  statement  easier  to 
read. Thus, although the statement "P(Q ≡ ((R ⊃ S)vT))" can be
unambiguously rendered as "P.Q ≡ .R ⊃ S.vT", it will be much easier to
interpret if written as "P:Q ≡ .R ⊃ S.vT". Here it is immediately obvious
that the double dot dominates the formula, whereas in "P.Q ≡ .R ⊃ S.vT"
a careful analysis must be made before one finds out that the dot next
to the "≡" controls the dot next to the "v", thus accidentally leaving the
dot between the "P" and "Q" in a dominant position.

It  is  this  possibility  of  making  the  dominant  symbol  of  a  formula  readily 
apparent  by  the  use  of  many  dots  next  to  it  that  is  the  real  advantage  of 
the  dot  notation.  If  one  always  uses  a  minimum  number  of  dots,  this 
advantage  is  lost. 

Like the use of abbreviations such as "PvQ" and omission of parentheses,
use  of  dots  is  not  part  of  our  symbolic  logic,  but  only  a  convenience  which 
we  permit  ourselves  in  talking  about  it. 

EXERCISES 

II.2.1. By using dots rewrite the following unambiguously without
parentheses:

(a) P~Q(P ⊃ Q).

(b) (P ≡ Q) ≡ PQv~P~Q.

(c) (P ⊃ R)(Q ⊃ R) ⊃ ((PvQ) ⊃ R).

(d) (PQ)R ⊃ P(QR).

(e) (P ⊃ (Q ⊃ R)) ≡ ((PQ) ⊃ R).

(f) PQR ⊃ ((P ≡ Q) ≡ R).

(g) (P ⊃ (Q ⊃ R)) ≡ (Q ⊃ (P ⊃ R)).

(h) P ⊃ (Q ≡ (P ≡ Q)).

(i) (P ⊃ Q) ⊃ ((P ⊃ (Q ⊃ R)) ⊃ (P ⊃ R)).

II.2.2. Rewrite the following unambiguously without the use of dots:

(a) ~Q ⊃ ~P. ≡ .P ⊃ Q.

(b) P ⊃ Q.~P ⊃ Q. ≡ Q.

(c) PQvR. ≡ .PvR.QvR.

(d) PvQ ≡ :P ⊃ Q. ⊃ Q.

(e) Q ≡ S. ⊃ :R ≡ T. ⊃ .QR ≡ ST.

(f) P ⊃ :Q ≡ ~Q. ⊃ ~P.

(g) P ⊃ .Q ≡ R: ≡ :P ⊃ Q. ≡ .P ⊃ R.

(h) Q ⊃ ~P. ⊃ :PvQ.R. ≡ PR.

3.  The  Use  of  Truth-value  Tables.  Suppose  we  start  with  several 
distinct statements "P₁", "P₂", ..., "Pₙ" and combine them by means of
"~" and "&" to form a compound statement "Q", using each as often
as desired. Then "Q" will be called a statement formula derived from
"P₁", "P₂", ..., "Pₙ", or more simply just a statement formula. "P₁",
"P₂", ..., "Pₙ" will be called the components of "Q". Thus "~P" is a
statement formula with the component "P", and "PvQ" (which is really
"~(~P&~Q)") is a statement formula with the components "P" and
"Q", and so on.

In  setting  up  the  above  definition  of  a  statement  formula,  we  have  taken 
the  first  step  toward  a  precise  definition  of  "statement."  We  are  still  a 
long  way  from  a  complete  or  precise  definition,  but  one  part  of  the  definition 
has  now  been  adopted,  namely,  that,  if  "P"  and  "Q"  are  any  statements, 
then "~P" and "(P&Q)" are also statements.

The  study  of  statement  formulas,  with  especial  reference  to  their  truth 
or  falsity,  constitutes  the  statement  calculus. 

If one writes down a statement formula at random, such as "P" or
"PvQ", it is likely that the truth or falsity of it will depend on the truth or
falsity of its component statements "P", "Q", etc., and if one does not
have information about the truth of the components, one cannot come to
any decision as to the truth of the formula. However, there are some
statement formulas whose truth or falsity does not depend upon the truth
or falsity of their components. For instance, "PQ ⊃ P" is true no matter
what statements "P" and "Q" are used in it, and "P~P" is false regardless
of what "P" is.

A  statement  formula  which  is  true  no  matter  what  statements  are  taken 
in  it  is  said  to  be  "universally  valid." 

We  now  present  a  simple  scheme  for  identifying  those  statement  formulas 
which  are  universally  valid.  This  method  depends  on  the  principle  that 
each  statement  is  either  true  or  false.  So  the  truth  value  of  a  statement 
must  be  either  T  (for  truth)  or  F  (for  falsehood).  Now,  if  one  knows  the 
truth value of "P", one can immediately infer the truth value of "~P".
If "P" is T, then "~P" is F, and if "P" is F, then "~P" is T. We can
summarize this in a "truth-value table" for "~" as follows:

    P    ~P

    T    F
    F    T

In this table, we have listed under "P" the possible truth values which it
can take, and under "~P" the corresponding truth values that it would
take.

Correspondingly, if one knows the truth values of "P" and "Q", one can
immediately infer the truth value of "PQ". We summarize the way of
doing this in a truth-value table for "&":

    P    Q    PQ

    T    T    T
    T    F    F
    F    T    F
    F    F    F

Under the pair "P", "Q" we have listed the four pairs of truth values
which "P" and "Q" can take, and under "PQ" the corresponding truth
values that it would take. We feel that these truth values are unequivocally
determined by our agreement that "&" is to serve as the equivalent in
symbolic logic of the English word "and", and hence that we need not
explain the entries under "PQ".

By  using  the  two  truth-value  tables  above,  one  can  compute  a  truth-value 
table for any statement formula, since it is defined in terms of "~" and
"&". Thus let us construct a truth-value table for "P ⊃ Q". This is
built up from "P" and "Q" by the following sequence of steps:

1. Put together "~" and "Q" to get "~Q".

2. Put together "P" and "~Q" to get "P~Q".

3. Put together "~" and "P~Q" to get "~(P~Q)".

If we start with any pair of truth values for "P" and "Q", we can get
the truth value for "P ⊃ Q" by going through the same sequence of steps.
We summarize these calculations in the following truth-value table for
"⊃":

    P    Q    ~Q    P~Q    P ⊃ Q

    T    T    F     F      T
    T    F    T     T      F
    F    T    F     F      T
    F    F    T     F      T
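
The three construction steps for "P ⊃ Q" can be followed by a short program. This is only an illustrative sketch in Python: the function names `neg`, `conj`, `implies`, and `show` are assumptions made for this example, with `True` and `False` standing for the truth values T and F.

```python
from itertools import product

def neg(p):
    """Truth value of "~P" from that of "P"."""
    return not p

def conj(p, q):
    """Truth value of "PQ" from those of "P" and "Q"."""
    return p and q

def implies(p, q):
    """ "P ⊃ Q" is built as "~(P~Q)", following steps 1 to 3. """
    return neg(conj(p, neg(q)))

def show(value):
    return "T" if value else "F"

# One row per pair of truth values, in the order TT, TF, FT, FF.
rows = []
for p, q in product([True, False], repeat=2):
    not_q = neg(q)               # step 1
    p_not_q = conj(p, not_q)     # step 2
    result = neg(p_not_q)        # step 3
    rows.append((p, q, not_q, p_not_q, result))

for row in rows:
    print("    ".join(show(value) for value in row))
```

Each printed row lists the values of "P", "Q", "~Q", "P~Q", and "P ⊃ Q" in that order, so the output can be compared line by line with the table.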

Similarly  we  get  the  truth-value  table  for  "v": 


    P    Q    ~P    ~Q    ~P~Q    PvQ

    T    T    F     F     F       T
    T    F    F     T     F       T
    F    T    T     F     F       T
    F    F    T     T     T       F

Use  of  these  tables  can  often  shorten  the  labor  of  constructing  tables  for 
more complicated statement formulas. Thus we get the following truth-value
table for "≡" by thinking of "P ≡ Q" as built up out of "P ⊃ Q"
and "Q ⊃ P":

    P    Q    P ⊃ Q    Q ⊃ P    P ≡ Q

    T    T    T        T        T
    T    F    F        T        F
    F    T    T        F        F
    F    F    T        T        T

The entries in the columns under "P ⊃ Q" and "Q ⊃ P" are written
down directly by referring to the truth-value table for "⊃".

We consider it as completely obvious that a statement formula is universally
valid if and only if it takes the truth value T for every combination
of truth values of its components. As we can check whether the latter is
the case by constructing a truth-value table for the statement formula, we
now have a means of testing whether a statement formula is universally
valid. Thus, to show that "PQ ⊃ P" is universally valid, we construct
its truth-value table:


    P    Q    PQ    PQ ⊃ P

    T    T    T     T
    T    F    F     T
    F    T    F     T
    F    F    F     T

Since only T's appear in the last column, "PQ ⊃ P" is universally valid.
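
The test just described, namely that the last column contains only T's, can be sketched in a few lines of Python. The helper names `implies` and `is_universally_valid` are hypothetical, and a formula is represented here simply as a Python function of its components; that representation is an assumption of this sketch, not part of the text.

```python
from itertools import product

def implies(p, q):
    # "P ⊃ Q" abbreviates "~(P~Q)".
    return not (p and not q)

def is_universally_valid(formula, count):
    """True when `formula` takes the value T for every combination of
    truth values of its `count` components."""
    return all(formula(*values)
               for values in product([True, False], repeat=count))

# "PQ ⊃ P": only T's appear in the last column.
print(is_universally_valid(lambda p, q: implies(p and q, p), 2))

# "P~P" takes the value F regardless of "P", so it is not universally valid.
print(is_universally_valid(lambda p: p and not p, 1))

# "P ⊃ Q" depends on its components, so it is not universally valid either.
print(is_universally_valid(lambda p, q: implies(p, q), 2))
```

The first call prints `True` and the other two print `False`. The number of rows examined doubles with each additional component, which is why larger formulas quickly become laborious by hand.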

If  one  has  a  statement  formula  based  on  three  or  more  statements, 
construction  of  a  truth-value  table  becomes  a  fairly  extended  enterprise. 
Thus, we wish to show that "P ⊃ Q. ⊃ .~(QR) ⊃ ~(RP)" is universally
valid. If we denote this by "S", we construct the following truth-value
table:

    P   Q   R   P ⊃ Q   QR   ~(QR)   RP   ~(RP)   ~(QR) ⊃ ~(RP)   S

    T   T   T   T       T    F       T    F       T               T
    T   T   F   T       F    T       F    T       T               T
    T   F   T   F       F    T       T    F       F               T
    T   F   F   F       F    T       F    T       T               T
    F   T   T   T       T    F       F    T       T               T
    F   T   F   T       F    T       F    T       T               T
    F   F   T   T       F    T       F    T       T               T
    F   F   F   T       F    T       F    T       T               T

To show that "P ⊃ Q. ⊃ .~(QR) ⊃ ~(RP)" always takes the value T
involves the inspection of eight cases. This suggests that perhaps we can
proceed more quickly if we try to show that it could never take the value F;
and indeed we can.

Suppose there is a set of truth values for "P", "Q", and "R" which
makes "P ⊃ Q. ⊃ .~(QR) ⊃ ~(RP)" take the value F. Inspection of
the truth-value table for "⊃" discloses that this can happen only in case
we have:

(a) The truth value T for "P ⊃ Q".

(b) The truth value F for "~(QR) ⊃ ~(RP)".

From (b), by the truth-value table for "⊃", we see that we must have:

(c) The truth value T for "~(QR)".

(d) The truth value F for "~(RP)".

From (d), we see that we must have:

(e) The truth value T for "RP".

From this follows that we must have:

(f) The truth value T for "R".

(g) The truth value T for "P".

From (g) and (a), we see, by the truth-value table for "⊃", that we
must have:

(h) The truth value T for "Q".

As (f) and (h) contradict (c), we conclude that "P ⊃ Q. ⊃ .~(QR) ⊃
~(RP)" never has the value F. As it must have either the value F or T
in each case, it must take the value T in all cases.

We  notice  that  the  procedure  of  constructing  a  truth-value  table  for  a 
statement  formula  is  purely  mechanical  and  depends  entirely  on  the  form 
of  the  statement  and  not  in  the  least  on  the  meaning  of  the  statement. 
One  could  certainly  build  a  machine  to  construct  truth-value  tables  for 
statement  formulas.  Thus  we  can  permit  the  use  of  truth-value  tables  as 
part  of  our  symbolic  logic. 
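
To make the remark about a machine concrete, here is a minimal sketch in Python of such a mechanical procedure. It works only from the form of a statement formula built with "~" and "&". The representation (a capital letter for a component, `("~", X)` and `("&", X, Y)` for compounds) and the names `components`, `evaluate`, `truth_table`, and `implies` are assumptions introduced for this illustration.

```python
from itertools import product

def components(formula):
    """Distinct statement letters of a formula, in order of first appearance."""
    if isinstance(formula, str):
        return [formula]
    seen = []
    for part in formula[1:]:
        for letter in components(part):
            if letter not in seen:
                seen.append(letter)
    return seen

def evaluate(formula, values):
    """Truth value of a formula, given truth values for its components."""
    if isinstance(formula, str):
        return values[formula]
    if formula[0] == "~":
        return not evaluate(formula[1], values)
    if formula[0] == "&":
        return evaluate(formula[1], values) and evaluate(formula[2], values)
    raise ValueError(f"unknown symbol {formula[0]!r}")

def truth_table(formula):
    """All combinations of truth values, each with the resulting value."""
    letters = components(formula)
    rows = []
    for combination in product([True, False], repeat=len(letters)):
        values = dict(zip(letters, combination))
        rows.append((combination, evaluate(formula, values)))
    return letters, rows

def implies(a, b):
    """ "A ⊃ B" written out in terms of "~" and "&" as "~(A~B)". """
    return ("~", ("&", a, ("~", b)))

# "P ⊃ Q. ⊃ .~(QR) ⊃ ~(RP)", denoted by "S" in the text.
S = implies(implies("P", "Q"),
            implies(("~", ("&", "Q", "R")), ("~", ("&", "R", "P"))))

letters, rows = truth_table(S)
print("  ".join(letters), " S")
for combination, value in rows:
    cells = ["T" if v else "F" for v in combination]
    print("  ".join(cells), " T" if value else " F")
```

The program never consults the meaning of "P", "Q", or "R"; it only follows the structure of the formula, which is the sense in which the procedure is purely mechanical.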

The suggested short cut, which we just explained, of assuming a statement
formula to have the value F and deriving a contradiction is not a
purely mechanical process, though it does not depend in any way on the
meaning of the statement. We may take the following point of view with
respect to it. An intelligent person, faced with the prospect of constructing
a fairly extensive truth-value table, might seek a line of reasoning whereby
he could learn what would be the result of constructing the truth-value
table without actually going to the trouble of constructing the table. Our
suggested short cut is one such possible line of reasoning. We cannot
accept such short cuts as part of our symbolic logic, because we are carefully
restricting our symbolic logic to purely mechanical processes. If a person
wishes to operate purely within the symbolic logic, he must actually
construct a truth-value table in each case.

Nonetheless,  the  suggested  short  cut  is  a  convenient  way  of  reassuring 
oneself  that,  if  one  should  construct  the  truth-value  table,  one  would  indeed 
always  get  the  truth  value  T  for  the  statement  in  question.  Thus  our  short 
cut,  though  not  part  of  the  symbolic  logic,  gives  us  valuable  information 
about  the  symbolic  logic. 

We  distinguish  sharply  between  operations  within  our  symbolic  logic, 
and  reasoning  about  the  symbolic  logic.  The  method  of  truth-value  tables, 
being  purely  mechanical,  can  be  admitted  as  a  procedure  within  our  logic. 
Our  short  cut,  being  not  purely  mechanical,  can  be  allowed  only  as  a  means 
outside  the  logic  of  getting  information  about  the  logic. 

As  we  shall  have  many  occasions  in  the  future  to  prove  things  about  our 
logic,  we  should  perhaps  indicate  roughly  now  what  we  shall  consider  as 
acceptable  methods  of  proof  about  our  logic.  We  shall  avoid  all  abstruse 
and  complex  methods  of  proof.  Our  reasoning  must  be  constructive  in  the 
sense  that,  if  we  claim  the  existence  of  anything,  we  must  explain  explicitly 
how  to  construct  it.  We  shall  also  avoid  indirect  reasoning.  This  means 
that we reject our short cut even as a means of reasoning about the symbolic
logic, since our short cut is an indirect argument, depending on
reductio  ad  absurdum. 

Actually, we can rewrite our short cut as a direct argument as follows.
We note first from the truth-value table for "⊃" that if "V" has the value
T, then so does "U ⊃ V", regardless of the value of "U". Likewise, if
"U" has the value F, then "U ⊃ V" has the value T regardless of the
value of "V". Likewise, if "U" has the value F, then so do "UV" and
"VU", regardless of the value of "V". Using these observations, we now
proceed as follows. If "P" has the value F, then so does "RP". Hence
"~(RP)" has the value T, whence we infer that "~(QR) ⊃ ~(RP)" has
the value T, whence we infer that "P ⊃ Q. ⊃ .~(QR) ⊃ ~(RP)" has the
value T. This disposes of the case when "P" has the value F. We now
let "P" have the value T. If "R" has the value F, we can repeat the
argument above. This leaves the case where "P" and "R" both have the value
T. Under this, we have two cases, namely, "Q" has the value T and "Q"
has the value F. In the latter case "P ⊃ Q" has the value F, and so
"P ⊃ Q. ⊃ .~(QR) ⊃ ~(RP)" has the value T. In the former case,
"Q" and "R" each have the value T, so that "~(QR)" has the value F,
so that "~(QR) ⊃ ~(RP)" has the value T, and likewise "P ⊃ Q. ⊃ .
~(QR) ⊃ ~(RP)" has the value T.

This argument can be summarized in the following abridged truth-value
table, in which again "S" denotes "P ⊃ Q. ⊃ .~(QR) ⊃ ~(RP)".

    P   Q   R   P ⊃ Q   QR   ~(QR)   RP   ~(RP)   ~(QR) ⊃ ~(RP)   S

    F                                F    T       T               T
    T       F                        F    T       T               T
    T   T   T           T    F                    T               T
    T   F   T   F                                                 T

Construction of such an abridged truth-value table is hardly a mechanical
process and so cannot be admitted as part of our symbolic logic.
However,  it  is  a  direct  and  simple  argument  and  so  can  be  admitted  as  a 
means  of  obtaining  information  about  the  symbolic  logic. 
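
The four cases of the direct argument can be checked with a small Python sketch in which a component whose value is left open is represented by `None`. This three-valued treatment is an assumption of the sketch, not a device used in the text, and the names `neg`, `conj`, `implies`, and `S` are hypothetical. The rule used for `conj` is exactly the observation made above: if one factor has the value F, so does the product, regardless of the other factor.

```python
def neg(p):
    """Value of "~P"; stays open (None) when "P" is open."""
    return None if p is None else not p

def conj(p, q):
    """Value of "PQ": F as soon as either factor is F, whatever the other is."""
    if p is False or q is False:
        return False
    if p is None or q is None:
        return None
    return True

def implies(p, q):
    # "P ⊃ Q" abbreviates "~(P~Q)".
    return neg(conj(p, neg(q)))

def S(p, q, r):
    """ "P ⊃ Q. ⊃ .~(QR) ⊃ ~(RP)" """
    return implies(implies(p, q),
                   implies(neg(conj(q, r)), neg(conj(r, p))))

# The four rows of the abridged table; None marks a value left unspecified.
cases = [
    (False, None, None),   # "P" has the value F
    (True, None, False),   # "P" has the value T, "R" has the value F
    (True, True, True),    # "P", "R", and "Q" all have the value T
    (True, False, True),   # "P" and "R" have the value T, "Q" the value F
]
results = [S(*case) for case in cases]
print(results)
```

Every entry of `results` is `True`, and the four cases together account for all eight combinations of truth values (four, two, one, and one respectively), which is what the abridged table relies on.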

The  distinction  between  operations  which  we  permit  within  the  symbolic 
logic  and  reasoning  which  we  permit  about  the  symbolic  logic  will  become 
clearer  as  we  proceed.  However,  we  state  the  crucial  point  again,  that 
operations  within  the  symbolic  logic  must  be  purely  mechanical.  When 
reasoning  about  the  symbolic  logic,  we  permit  nonmechanical  reasoning 
processes,  but  we  permit  only  very  simple  and  direct  reasoning  processes. 


EXERCISES

II.3.1. Prove that the following are universally valid:

(a) P ⊃ Q.Q ⊃ R. ⊃ .P ⊃ R.

(b) ~Q ⊃ ~P. ⊃ .P ⊃ Q.

(c) P ⊃ R.Q ⊃ R. ⊃ .PvQ ⊃ R.

(d) P ⊃ Q.~P ⊃ Q. ⊃ Q.

(e) ~P ⊃ Q~Q. ⊃ P.

(f) P ⊃ Q~Q. ⊃ ~P.

(g) ~P ⊃ P. ⊃ P.

(h) P ⊃ ~P. ⊃ ~P.

(i) P~Q ⊃ R~R. ⊃ .P ⊃ Q.

(j) P~Q ⊃ Q. ⊃ .P ⊃ Q.

(k) P~Q ⊃ ~P. ⊃ .P ⊃ Q.

(l) P ⊃ Q.~P ⊃ R. ⊃ QvR.

(m) ~P ⊃ Q. ⊃ PvQ.

(n) P ⊃ Q. ⊃ Qv~P.

(o) P ≡ Q. ⊃ .Q ≡ P.

(p) P ≡ Q.Q ≡ R. ⊃ .P ≡ R.

(q) Pv~P.


II.3.2. Prove that the following are universally valid:

(a) P ⊃ PP.

(b) Q ≡ S. ⊃ .~Q ≡ ~S.

(c) Q ≡ S. ⊃ :R ≡ T. ⊃ .QR ≡ ST.

(d) P ⊃ Q. ⊃ :.P ⊃ .Q ⊃ R: ⊃ .P ⊃ R.

(e) P ≡ ~P. ⊃ Q~Q.

(f) P ⊃ :Q ≡ ~Q. ⊃ ~P.

(g) P ⊃ .~Q ⊃ ~(P ⊃ Q).

(h) PvQ ≡ :P ⊃ Q. ⊃ Q.

(i) Q ⊃ R. ⊃ :.P ⊃ .R ⊃ S: ⊃ :P ⊃ .Q ⊃ S.

(j) PQ ⊃ R. ≡ .P~R ⊃ ~Q.

II.3.3. Let "P₁", "P₂", ..., "Pₙ", "R₁", "R₂", ..., "Rₙ" be any
statements; let "Q" be the logical product of "~(PᵢPⱼ)" for every i and j
with 1 ≤ i < j ≤ n. Then for 1 ≤ k ≤ n, show that "QPₖ ⊃ .P₁R₁vP₂R₂v
... vPₙRₙ ≡ Rₖ" is universally valid.

II.3.4. Write six statement formulas which always take the value F,
and prove that they do so.

II.3.5. Write a logical sum of several of "PQR", "PQ~R", "P~QR",
"P~Q~R", "~PQR", "~PQ~R", "~P~QR", and "~P~Q~R"
which will take the value T for the following sets of values of "P", "Q",
and "R":

    P    T    T    F    F    F
    Q    T    F    T    F    F
    R    T    T    T    T    F

and will take the value F for all other sets of values of "P", "Q", and "R".

II.3.6. Prove that "PQ ≡ QP" and "PvQ ≡ QvP" are universally valid.
Since these are analogous to "xy = yx" and "x + y = y + x", we express
them by saying that "&" and "v" are commutative.

II.3.7. Prove that "(PQ)R ≡ P(QR)" and "(PvQ)vR ≡ Pv(QvR)" are
universally valid. Since these are analogous to "(xy)z = x(yz)" and
"(x + y) + z = x + (y + z)", we express them by saying that "&" and
"v" are associative.

II.3.8. Prove that "(PvQ)R ≡ PRvQR" is universally valid. Since
this is analogous to "(x + y)z = xz + yz", we express it by saying that
"&" is distributive with respect to "v".

II.3.9. Prove that "PQvR ≡ (PvR)(QvR)" is universally valid. This
is quite unlike any familiar algebraic principle. We express it by saying
that "v" is distributive with respect to "&".

4.  Applications  to  Mathematical  Reasoning.  For  this,  we  give  meanings 
to  our  formulas  of  symbolic  logic.  These  meanings  constitute  principles 
of  reasoning.  Some  formulas  state  invalid  principles  of  reasoning.  Thus 
the meaning of "P ⊃ Q. ⊃ .Q ⊃ P" is that, if an implication is true, then
its converse is also true, which is not so. Other formulas state valid
principles of mathematical reasoning; in particular any statement formula which
always takes the value T states a valid principle of mathematical reasoning.
Any principle of this sort is likely to be rather elementary when considered
as a theorem of mathematics. Deeper methods of proof are needed to
establish any really difficult theorems. Nonetheless, universally valid
statement formulas can be used to justify a number of quite useful
principles of mathematical reasoning. Indeed, one value of the statement
calculus is that in effect it is a reservoir of such useful principles of
mathematical reasoning. Each statement formula listed in Ex. II.3.1 is a
statement formula from which one can derive useful principles of mathematical
reasoning. We now derive and illustrate these principles.

1. By a study of the truth-value table for "⊃" we can verify the
following  principle,  known  as  "modus  ponens": 

If  "P"  and  "P  D  Q"  are  both  proved,  then  one  is  entitled  to  infer  that 
"Q"  is  proved. 

When  use  is  made  of  the  rule  of  modus  ponens,  "P  D  Q"  is  called  the 
"major  premise"  and  "P"  is  called  the  "minor  premise". 

Rather  commonly,  if  one  has  "P  D  Q"  proved,  one  proved  it  by  assuming 
"P"  and  deducing  "Q".  In  such  a  case,  if  one  has  "P"  proved,  then  the 
deduction  of  "Q"  from  "P"  would  constitute  a  proof  of  "Q"  and  we  do  not 
need  to  use  modus  ponens  (unless  it  is  used  in  the  deduction  of  "Q"  from 
"P").  However,  even  if  "P  D  Q"  was  proved  by  some  other  means,  we 
can  nonetheless  infer  "Q"  by  modus  ponens  if  "P"  is  known  true.  An 
illustration of this would be the following. Suppose one has proved
"~Q ⊃ ~P". By Ex. II.3.1(b), we have also proved "~Q ⊃ ~P. ⊃ .
P ⊃ Q". So we use modus ponens with "~Q ⊃ ~P. ⊃ .P ⊃ Q" as the
major premise and "~Q ⊃ ~P" as the minor premise and infer "P ⊃ Q".
If now "P" were also provable, we could use modus ponens again to infer
"Q".
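
For readers who use a proof assistant, the rule of modus ponens and the illustration just given can be stated as follows. This is only a sketch in Lean 4, under the assumption that Lean's `Prop` and `→` may stand in for statements and for "⊃"; the hypothesis names `hpq`, `hp`, and `h` are arbitrary. Note that the contrapositive step is justified here by Lean's classical reasoning rather than by a truth-value table.

```lean
-- Modus ponens: from "P ⊃ Q" (major premise) and "P" (minor premise), infer "Q".
example (P Q : Prop) (hpq : P → Q) (hp : P) : Q :=
  hpq hp

-- The illustration: from "~Q ⊃ ~P" and "P", infer "Q".
-- The step corresponding to Ex. II.3.1(b) uses classical logic.
example (P Q : Prop) (h : ¬Q → ¬P) (hp : P) : Q :=
  Classical.byContradiction (fun hnq => h hnq hp)
```

In the second example, `h hnq` is a proof of "~P", which applied to `hp` yields the contradiction that `Classical.byContradiction` requires.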

An obvious generalization is that, if one has proved each of "P₁",
"P₁ ⊃ P₂", "P₂ ⊃ P₃", ..., "Pₙ₋₁ ⊃ Pₙ", then one can infer "Pₙ".
Practical use is made of this generalization in the so-called "analytic
method of proof" in solving problems in elementary geometry. We quote
a selected explanation of this method (Stone and Mallory, page 68):¹

"The  steps  of  an  analysis  may  be  expressed  symbolically  as  follows:  If  A 
is  to  be  proved,  reason  thus:  A  will  be  true  if  B  can  be  proved  true;  B  will 
be  true  if  C  can  be  proved  true;  C  will  follow  if  D  can  be  proved  true;  but 
D  is  given  true;  hence  begin  by  proving  that  C  is  true." 

2.  If  "P"  and  "Q"  are  proved,  we  can  infer  "PQ".  This  important 
principle  is  used  in  constructing  our  truth-value  table  for  "&". 

¹ From "Modern Geometry" by John C. Stone and Virgil S. Mallory, copyright
1930, courtesy of Benj. H. Sanborn & Co.

An illustration of the use of this principle would come in proving two
lines  to  be  parallel  in  solid  geometry.  In  solid  geometry,  the  definition  of 
parallel  lines  involves  a  logical  product,  to  wit:  Two  lines  are  parallel  if 
(a)  they  are  coplanar,  (b)  they  never  meet.  To  prove  two  lines  parallel  one 
proves  separately  each  of  the  two  factors  of  the  logical  product,  and  then 
infers  their  logical  product.    For  instance: 

Theorem.  The  intersections  of  two  parallel  planes  by  a  third  plane 
are parallel lines (see Fig. II.4.1).


Given:  Parallel  planes  EF  and  GH,  intersected  in  lines  AB  and  CD  by 
plane  ABCD. 

To  prove:     AB  parallel  to  CD. 

1.  AB  and  CD  are  coplanar,  since  both  lie  in  plane  ABCD. 

2.  AB  and  CD  never  meet,  for  if  they  met,  the  point  at  which  they  meet 
would  be  a  point  at  which  planes  EF  and  GH  meet. 

3.  Therefore  AB  and  CD  are  parallel. 

Q.E.D. 

Generalization to logical products of more than two factors is immediate.
In general, if "P₁", "P₂", ..., "Pₙ" are proved, we can infer "P₁P₂ ··· Pₙ".
As an instance of a proof of a logical product of four factors (which is
proved by proving each of the four factors separately) we note Thm. 17 on
page 145 of Birkhoff and MacLane:¹

"Theorem 17. The intersection S ∩ T of two subgroups S and T of a
group G is a subgroup of G."

To prove this, one must prove that S ∩ T is a group, and the definition
of a group G is the logical product of the following four statements:

(a) If x and y are in G, then xy is in G.

(b) If x, y, and z are in G, then (xy)z = x(yz).

(c) There is a unit, e, in G such that for each x in G, ex = x.

(d) If e is a unit of G and x is in G, then there is an inverse x⁻¹ in G such
that x⁻¹x = e.

1 From Birkhoff and MacLane, "A Survey of Modern Algebra," copyright 1941 by
The Macmillan Company and used with their permission.

In  the  proof  of  their  Thm.  17  given  by  Birkhoff  and  MacLane,  (b)  is 
taken  as  obvious,  and  they  prove  (d),  (a),  and  (c). 

In  Thm.  179  on  page  147  of  Hardy  and  Wright,  the  conclusion  is  a  logical 
product  of  five  factors.    It  is  proved  by  proving  each  factor  separately. 
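
The principle of this item, in the form with four factors, may be sketched in Lean 4 as follows (core library only; the letters A, B, C, D and the hypothesis names are ours, chosen for illustration). The symbol "∧" plays the role of the logical product.

```lean
-- Each factor is proved separately; the logical product is then inferred.
example (A B C D : Prop) (ha : A) (hb : B) (hc : C) (hd : D) :
    A ∧ B ∧ C ∧ D :=
  ⟨ha, hb, hc, hd⟩
```

The order in which the four factors were proved plays no part in the final inference, which agrees with the remark that Birkhoff and MacLane prove (d), (a), and (c) in that order.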

3. The most fundamental way of proving "P ⊃ Q" is to assume the
truth of "P" and then correctly deduce the truth of "Q". This is a very
important principle, and in later chapters we shall give considerable
attention to it. However, we have to pass over it for the moment, since it
is beyond the scope of the statement calculus.

Nevertheless, the statement calculus furnishes a number of useful
subsidiary methods of proving "P ⊃ Q". For instance, if we can find a
statement "R" such that we can prove "P ⊃ R" and "R ⊃ Q", then by
Ex. II.3.1(a), we can infer "P ⊃ Q". An obvious generalization is to
infer "P ⊃ Q" from "P ⊃ R₁", "R₁ ⊃ R₂", ..., "Rₙ₋₁ ⊃ Rₙ", and
"Rₙ ⊃ Q". A practical application of this will be noted under the methods
of proving "P ≡ Q".
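
As a check on the principle just stated, we give a minimal Lean 4 sketch (core library only; the rendering is ours, with "→" for the implication sign). The second example takes two intermediate statements, which is the case n = 2 of the generalization.

```lean
-- From "P → R" and "R → Q" infer "P → Q".
example (P R Q : Prop) (h₁ : P → R) (h₂ : R → Q) : P → Q :=
  fun hp => h₂ (h₁ hp)

-- The generalization with two intermediate statements R₁ and R₂.
example (P R₁ R₂ Q : Prop)
    (h₀ : P → R₁) (h₁ : R₁ → R₂) (h₂ : R₂ → Q) : P → Q :=
  fun hp => h₂ (h₁ (h₀ hp))
```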

4. Another way of proving "P ⊃ Q" is to prove "~Q ⊃ ~P" and
make use of Ex. II.3.1(b). The proof of "~Q ⊃ ~P" might be carried
out in the fundamental way, namely, by assuming "~Q" and correctly
deducing "~P". However, it is quite immaterial what method is used to
prove "~Q ⊃ ~P". If it is proved, then one can infer "P ⊃ Q".
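
The inference from the contrapositive can be sketched in Lean 4 as follows (core library only; the rendering is ours, with "¬" for negation and "→" for implication). We note that the sketch appeals to the classical principle `Classical.byContradiction`, since this direction of contraposition is not available in Lean's purely constructive fragment.

```lean
-- From "¬Q → ¬P" infer "P → Q".
example (P Q : Prop) (h : ¬Q → ¬P) : P → Q :=
  fun hp => Classical.byContradiction (fun hnq => h hnq hp)
```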

This means of proving "P ⊃ Q" is widely used, and we quote several
instances. The first is paraphrased from page 41 of Hardy and Wright.

Theorem. If a is an integer and 2 divides a², then 2 divides a.

Proof. Assume 2 does not divide a. Then a has the form 2m + 1.
Hence a² is 4m² + 4m + 1. Thus 2 does not divide a².

The next is quoted exactly from page 7 of Bôcher, 1907.¹

"Theorem 4. If the product of two or more polynomials is identically
zero, at least one of the factors must be identically zero.

"For  if  none  of  them  were  identically  zero,  they  would  all  have  definite 
degrees,  and  therefore  their  product  would,  by  Theorem  3,  have  a  definite 
degree,  and  would  therefore  not  vanish  identically." 

The next is taken from page 20 of Fort, 1930.²

"Theorem 23. Hypothesis: $\sum_{n=0}^{\infty} a_n$ diverges. Conclusion:
$\sum_{n=k}^{\infty} a_n$ diverges, k being any fixed integer."

1 From Bôcher, "Introduction to Higher Algebra," copyright 1907 by The Macmillan
Company and used with their permission.
2 From Fort, op. cit.

The proof consists of assuming that $\sum_{n=k}^{\infty} a_n$ converges and
deducing that $\sum_{n=0}^{\infty} a_n$ converges. The details are irrelevant
for the present discussion, and we omit them.

Still other instances of proving "P ⊃ Q" by proving "~Q ⊃ ~P" will
appear incidentally in connection with the illustration of other principles.
We remark that the method of proving "P ⊃ Q" by proving "~Q ⊃ ~P"
could be used to advantage in many cases which are now handled by
reductio ad absurdum. There is a method of proving "P ⊃ Q" by reductio
ad absurdum that practically amounts to proving "~Q ⊃ ~P"; and one
could usually simplify the proof by writing it as a proof of "~Q ⊃ ~P".
We shall cite some specific instances in connection with our discussion of
reductio ad absurdum.

5. If one has an implication of the special form "P∨Q ⊃ R" to prove,
then special methods are available, to wit, if one can prove "P ⊃ R" and
"Q ⊃ R", then by Ex. II.3.1(c) one can infer "P∨Q ⊃ R".

The usual method of proving "P∨Q ⊃ R" amounts to this. Thus the
outline of a typical proof of "P∨Q ⊃ R" might run in words somewhat as
follows:

"Given: 'P∨Q'.

"To prove: 'R'.

"Case 1. 'P' is true. Then ... (here a deduction of 'R' from 'P' is
given).

"Case 2. 'Q' is true. Then ... (here a deduction of 'R' from 'Q' is
given).

"But since we are given that one of 'P' or 'Q' must be true, 'R' must
follow. Q.E.D."

The reader will observe that "Case 1" comprises a proof of "P ⊃ R"
and "Case 2" comprises a proof of "Q ⊃ R", and the proof would be
shortened and simplified if one merely gave these proofs of "P ⊃ R" and
"Q ⊃ R" and then cited Ex. II.3.1(c).

Another advantage in using Ex. II.3.1(c) instead of the verbal "proof by
cases" cited above is the greater flexibility permitted by Ex. II.3.1(c). In
the verbal form, one is committed to proving "P ⊃ R" by assuming "P"
and deducing "R" as in "Case 1" and to proving "Q ⊃ R" by assuming
"Q" and deducing "R" as in "Case 2". On the other hand, if one is using
Ex. II.3.1(c), it suffices that each of "P ⊃ R" and "Q ⊃ R" be proved, and
the method of proof is quite irrelevant. Thus, for instance, one might prove
"P ⊃ R" by reductio ad absurdum and "Q ⊃ R" by proving "~R ⊃ ~Q".

We illustrate the use of Ex. II.3.1(c) with the proof of a theorem from
plane geometry. Suppose we have already proved: (a) If two sides of a
triangle are equal, then the opposite angles are equal. (b) If two sides are
unequal, then the opposite angles are unequal, and the angle opposite the
greater side is the greater. We now wish to prove that, if two angles are
unequal, then the opposite sides are unequal, and the side opposite
the greater angle is the greater (see Fig. II.4.2). That is, in the notation
of elementary geometry:

Given:    Angle  A  greater  than  angle  B. 

To  prove:     Side  a  greater  than  side  b. 

By Ex. II.3.1(b), it suffices to prove "If a is not greater than b, then A
is not greater than B". This is the same as "If a equals b or a is less
than b, then A is not greater than B". So we now wish to prove
"P∨Q ⊃ R" where:

"P"  is  "a  equals  b". 

"Q"  is  "a  is  less  than  b". 

"R"  is  "A  is  not  greater  than  B". 

The proof proceeds by proving "P ⊃ R" and "Q ⊃ R". In fact our
earlier proved result (a) gives "P ⊃ R" since if "P" then "a equals b" so
that "A equals B" so that "A is not greater than B" so that "R", and
our earlier proved result (b) gives "Q ⊃ R" since if "Q" then "a is less
than b" so that "A is less than B" so that "A is not greater than B" so
that "R". Then by Ex. II.3.1(c), the desired theorem follows.

We shall refer to the process of proving "P∨Q ⊃ R" by proving each of
"P ⊃ R" and "Q ⊃ R" as "proof by cases." We note the generalization
that if "P₁ ⊃ Q", "P₂ ⊃ Q", ..., "Pₙ ⊃ Q", then "P₁∨P₂∨ ··· ∨Pₙ ⊃ Q".
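
Proof by cases, together with its generalization to three cases, may be sketched in Lean 4 as follows (core library only; the rendering is ours, with "∨" for "or" and "→" for implication). The method by which each separate implication was obtained does not enter; only the implications themselves are used.

```lean
-- From "P → R" and "Q → R" infer "P ∨ Q → R".
example (P Q R : Prop) (h₁ : P → R) (h₂ : Q → R) : P ∨ Q → R :=
  fun h => h.elim h₁ h₂

-- The generalization with three cases.
example (P₁ P₂ P₃ Q : Prop)
    (h₁ : P₁ → Q) (h₂ : P₂ → Q) (h₃ : P₃ → Q) : P₁ ∨ P₂ ∨ P₃ → Q :=
  fun h => h.elim h₁ (fun h' => h'.elim h₂ h₃)
```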

6. An interesting special case of the above is where we can prove "P ⊃ Q"
and "~P ⊃ Q". Then by Ex. II.3.1(c), we get "P∨~P ⊃ Q". However,
we have "P∨~P" by Ex. II.3.1(q), so that by modus ponens we get
simply "Q". This process has been reduced to one step in Ex. II.3.1(d),
which says that, from "P ⊃ Q" and "~P ⊃ Q", we get "Q". Use of this
principle is also called "proof by cases."
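
The two steps described above, namely proof by cases followed by modus ponens with the law of the excluded middle, can be sketched in Lean 4 (core library only; the rendering is ours). We assume only the core theorem `Classical.em`, which supplies "P ∨ ¬P".

```lean
-- From "P → Q" and "¬P → Q" infer "Q", using the excluded middle for P.
example (P Q : Prop) (h₁ : P → Q) (h₂ : ¬P → Q) : Q :=
  (Classical.em P).elim h₁ h₂
```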

An illustration of this is the proof given on pages 31 to 33 of Bôcher,
1907, of the theorem:¹

"If D' is the adjoint of any determinant D, and M and M' are
corresponding m-rowed minors of D and D' respectively, then M' is equal to
the product of $D^{m-1}$ by the algebraic complement of M."

The proof starts out:¹

"We will prove this theorem first for the special case in which the minors
M and M' lie at the upper left-hand corners of D and D' respectively."

When the proof has been completed for this case, we find the words:¹

"Turning now to the case in which the minors M and M' do not lie at
the upper left-hand corners of D and D', ..." and a proof is carried out for
this case also.

1 From Bôcher, op. cit.

7. Reductio ad absurdum. There are a number of minor variations of
reductio ad absurdum, and we shall consider several of the more common.
However, the prototype of all reductio ad absurdum proofs is the following.
We wish to prove "P", and we do so by assuming the negation "~P" and
deriving a contradiction. This being absurd, we reject the possibility
"~P" and conclude "P".

A contradiction is any statement of the form "Q~Q". If we can derive a
contradiction from "~P", this means we can prove "~P ⊃ Q~Q". Then
by Ex. II.3.1(e), we infer "P".

It is not necessary that both "Q" and "~Q" be deduced from "~P" in
order that we deduce a contradiction from "~P". The common situation
with regard to reductio ad absurdum is that "Q" will be known true ahead
of time. Then assuming "~P" we deduce "~Q". Combining this with
the known result "Q" gives our contradiction "Q~Q", and we infer "P".
As an illustration of this we note a portion of the proof of Thm. 3 on page 52
of Bôcher, 1907. Bôcher has assumed that a set of solutions of a system of
homogeneous linear equations of rank r in n variables forms a fundamental
system and is trying to conclude that they are n − r in number. So he
assumes that they are not n − r in number and succeeds in contradicting a
previously proved result (Thm. 1 on page 50 of Bôcher, 1907, to be precise).

In the situation where "Q" is known ahead of time and one proves "P"
by reductio ad absurdum by assuming "~P" and deducing "~Q", an
alternative proof is available, as follows. Assume "~P" and deduce "~Q".
This proves "~P ⊃ ~Q" from which follows "Q ⊃ P". However, "Q" is
known, so that we infer "P" by modus ponens.

In an extremely rare form of reductio ad absurdum, one can deduce "P"
from "~P". In this case, Ex. II.3.1(g) tells us that we can infer "P".
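
The three forms of reductio ad absurdum considered in this item may be sketched in Lean 4 as follows (core library only; the rendering is ours, with "∧" for the logical product and "¬" for negation). Each sketch rests on the classical principle `Classical.byContradiction`.

```lean
-- Prototype: from "¬P" a contradiction "Q ∧ ¬Q" is derived, so "P".
example (P Q : Prop) (h : ¬P → Q ∧ ¬Q) : P :=
  Classical.byContradiction (fun hnp => (h hnp).2 (h hnp).1)

-- Common situation: "Q" is known ahead of time and "¬Q" is deduced from "¬P".
example (P Q : Prop) (hq : Q) (h : ¬P → ¬Q) : P :=
  Classical.byContradiction (fun hnp => h hnp hq)

-- Rare form: "P" itself is deduced from "¬P".
example (P : Prop) (h : ¬P → P) : P :=
  Classical.byContradiction (fun hnp => hnp (h hnp))
```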

8. Proof of "~P" by reductio ad absurdum. This is a minor variation
of the preceding. We assume "P" and deduce a contradiction "Q~Q".
Then by Ex. II.3.1(f), we infer "~P". We quote the following illustration
from page 46 of Hardy and Wright.¹

"Theorem 47. e is irrational.

"If e = m/n and k > n, then n | k!, and

$$k!\left(e - 1 - \frac{1}{1!} - \frac{1}{2!} - \cdots - \frac{1}{k!}\right)$$

is an integer. But it is equal to

$$\frac{1}{k+1} + \frac{1}{(k+1)(k+2)} + \frac{1}{(k+1)(k+2)(k+3)} + \cdots,$$

which is positive and less than

$$\frac{1}{k+1} + \frac{1}{(k+1)^2} + \frac{1}{(k+1)^3} + \cdots = \frac{1}{k};$$

and this is a contradiction."

1 From "An Introduction to the Theory of Numbers" by G. H. Hardy and E. M.
Wright, copyright 1938 by Clarendon Press, courtesy of Oxford University Press.

As in the preceding type of reductio ad absurdum, we may have "Q" a
known result and deduce "~Q" from "P". There is also the very rare
form in which we deduce "~P" from "P". By Ex. II.3.1(h), we can infer
"~P" in such a case.

As "P∨Q" is "~(~P~Q)", a proof of "P∨Q" by reductio ad absurdum
would be a special case of a proof of "~P" by reductio ad absurdum.
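
The variations of this item can be sketched in Lean 4 as follows (core library only; the rendering is ours). The first three sketches need no classical principle, since in Lean "¬P" is by definition "P → False". The last sketch, which passes from "¬(¬P ∧ ¬Q)" to "P ∨ Q", does use `Classical.byContradiction`.

```lean
-- From "P" a contradiction "Q ∧ ¬Q" is derived, so "¬P".
example (P Q : Prop) (h : P → Q ∧ ¬Q) : ¬P :=
  fun hp => (h hp).2 (h hp).1

-- "Q" is a known result and "¬Q" is deduced from "P".
example (P Q : Prop) (hq : Q) (h : P → ¬Q) : ¬P :=
  fun hp => h hp hq

-- Rare form: "¬P" is deduced from "P".
example (P : Prop) (h : P → ¬P) : ¬P :=
  fun hp => h hp hp

-- A proof of "¬(¬P ∧ ¬Q)" yields "P ∨ Q".
example (P Q : Prop) (h : ¬(¬P ∧ ¬Q)) : P ∨ Q :=
  Classical.byContradiction (fun hn =>
    h ⟨fun hp => hn (Or.inl hp), fun hq => hn (Or.inr hq)⟩)
```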

9. Proof of "P ⊃ Q" by reductio ad absurdum. One can approach this
in two different ways which come to the same thing. Since "P ⊃ Q" is
"~(P~Q)", one can assume "P~Q" and try to deduce a contradiction.
Alternatively, one can assume "P" with the intention of deducing "Q" and
then elect to deduce "Q" by reductio ad absurdum. For this, one assumes
"~Q" and seeks to derive a contradiction. The net result in this case is
that one has "P" and "~Q" assumed and is trying to derive a
contradiction. In one approach we have "P~Q" assumed; in the other we have
"P" and "~Q" assumed. The difference is insignificant. So we
characterize a proof of "P ⊃ Q" by reductio ad absurdum as a proof in which one
assumes "P~Q" and tries to derive a contradiction "R~R". If one
succeeds, then by Ex. II.3.1(i), one can infer "P ⊃ Q". We quote the
following illustration from Altshiller-Court, 1925, pages 65 to 66 (see
Fig. II.4.3).¹

"Theorem. If two internal bisectors of a triangle are equal, the triangle is
isosceles.

"Let bisector BV equal bisector CW. If the triangle is not isosceles, then
one angle, say B, is larger than the other, C, and from the two triangles
BVC and BCW, in which BV = CW, BC = BC, and angle B is greater than
angle C, we have CV greater than BW. Now through V and W draw parallels
to BA and BV respectively. From the parallelogram BVGW we have
BV = WG = CW. Hence the triangle GWC is isosceles, and
∠(g + g') = ∠(c + c'). But ∠g = ∠b. Hence ∠(b + g') = ∠(c + c')
and therefore g' is smaller than c'. Thus in the triangle GVC, we have CV
smaller than GV, but GV = BW. Hence CV is smaller than BW.
Consequently, the assumption of the inequality of the angles B, C leads to two
contradictory results. Hence B = C, and the triangle is isosceles."

Fig. II.4.3.

1 From "College Geometry" by N. Altshiller-Court, copyright 1925 by Johnson
Publishing Co., courtesy of Barnes & Noble, Inc.
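
The characterization of this item, in which one assumes "P" together with the negation of "Q" and derives a contradiction, can be sketched in Lean 4 as follows (core library only; the rendering is ours, and the statement "R" is an arbitrary auxiliary statement as in the text).

```lean
-- From "P ∧ ¬Q → R ∧ ¬R" infer "P → Q".
example (P Q R : Prop) (h : P ∧ ¬Q → R ∧ ¬R) : P → Q :=
  fun hp => Classical.byContradiction
    (fun hnq => (h ⟨hp, hnq⟩).2 (h ⟨hp, hnq⟩).1)
```

In the geometric illustration, "R" would be a translation of "CV is greater than BW".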

10. Proof of "P ⊃ Q" by reductio ad absurdum. We note a special case
in which from "P~Q" one can derive "Q". Since one can also derive "~Q"
from "P~Q", the necessary contradiction is forthcoming, and we have a
proof of "P ⊃ Q" by reductio ad absurdum. Alternatively, one may simply
apply Ex. II.3.1(j). An illustration is afforded by the following proof.

"If a < b + c whenever c > 0, then a ≤ b.

"Proof. Assume that a < b + c whenever c > 0, and also assume
a > b. Then (a − b)/2 > 0. Take this to be c. Then

a < b + (a − b)/2,
2a < 2b + a − b,
a < b."
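
The special case of this item can be sketched in Lean 4 as follows (core library only; the rendering is ours). From the two assumptions together one derives "Q" itself, which conflicts with the assumed negation of "Q".

```lean
-- From "P ∧ ¬Q → Q" infer "P → Q".
example (P Q : Prop) (h : P ∧ ¬Q → Q) : P → Q :=
  fun hp => Classical.byContradiction (fun hnq => hnq (h ⟨hp, hnq⟩))
```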

11. Proof of "P ⊃ Q" by reductio ad absurdum. Another special case is
that in which from "P~Q" one can derive "~P". One can also derive
"P" from "P~Q" and so get "P ⊃ Q" by reductio ad absurdum, or one
may simply apply Ex. II.3.1(k). An illustration of this type of proof is
the following (see Fig. II.4.4 and Fig. II.4.5).

"If two lines in a plane are cut by a transversal and alternate interior
angles are equal, then the lines are parallel.

"Given: α = β.

"To prove: AC parallel to BD.

"Suppose they are not parallel, so that AC and BD meet, say at E. Lay
off BF equal to AE. Then triangles ABE and BAF are congruent (two
sides and the included angle equal). Thus β = ∠BAF which is less than α.
So β ≠ α."

Strictly  speaking,  this  proof  is  not  complete,  being  only  one  case  of  a 
proof  by  cases.  The  other  case,  when  AC  meets  BD  on  the  other  side  of 
B,  can  easily  be  carried  out  in  an  analogous  fashion. 

In many of the cases where one assumes "P~Q" and deduces "~P",
no use is made of "P" in the deduction. In other words, "~P" is
deduced from "~Q" alone. In such cases we have our choice of an alternative
method of proof, namely, not to bother assuming "P", but merely to write
out the deduction of "~P" from "~Q". This constitutes a legitimate proof
of "~Q ⊃ ~P", from which we get "P ⊃ Q" immediately (see above, or
refer to Ex. II.3.1(b)).

This  alternative  method  seems  more  elegant  to  us  than  the  reductio  ad 
absurdum  proof,  but  as  both  are  quite  valid  it  is  certainly  a  matter  of  taste 
which  one  uses. 
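
The special case of this item, and the remark that a deduction making no use of "P" already establishes the contrapositive, can be sketched in Lean 4 as follows (core library only; the rendering is ours).

```lean
-- From "P ∧ ¬Q → ¬P" infer "P → Q".
example (P Q : Prop) (h : P ∧ ¬Q → ¬P) : P → Q :=
  fun hp => Classical.byContradiction (fun hnq => h ⟨hp, hnq⟩ hp)

-- If "¬P" is deduced from "¬Q" alone, then it is certainly deducible
-- from "P" and "¬Q" together; the assumption "P" is simply idle.
example (P Q : Prop) (h : ¬Q → ¬P) : P ∧ ¬Q → ¬P :=
  fun hpq => h hpq.2
```

The second sketch shows why the two methods are interchangeable in such cases: the hypothesis of the first is obtained from the contrapositive without further work.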

An illustration of a proof of "P ⊃ Q" by reductio ad absurdum in which
one assumes "P~Q" and deduces "~P" but makes no use of "P" in the
deduction is to be found in the proof of Satz 1 on page 3 of Landau, 1930.
On page 2, Landau has assumed Axiom 4, to wit:¹

"Aus x' = y' folgt x = y."

That is,

"x' = y' ⊃ x = y."

Then by Ex. II.3.1(b), one has immediately

"x ≠ y ⊃ x' ≠ y'."

However, Landau chooses to prove this by reductio ad absurdum. We
quote:¹

"Satz 1: Aus x ≠ y folgt x' ≠ y'.

"Beweis: Sonst wäre x' = y', also nach Axiom 4 x = y."

12. Proofs of "P∨Q". These are fairly uncommon, and no particular
method is widely used. Ex. II.3.1(l), (m), (n), give three principles which
could furnish proofs of "P∨Q". The first is a special case of "proof by
cases" which we mentioned earlier. We now illustrate each of Ex. II.3.1(l),
(m), and (n) in the order named.

1 From "Grundlagen der Analysis" by Edmund Landau, quoted by special license
from the Department of Justice, Office of Alien Property.

First illustration.

"If each $a_n$ is positive, then either $\sum_{n=0}^{\infty} a_n$ converges
or else it diverges to $+\infty$.

"Proof. If $\sum_{n=0}^{N} a_n$ is bounded, then $\sum_{n=0}^{N} a_n$ is a
bounded, monotone sequence and has a limit. Hence $\sum_{n=0}^{\infty} a_n$
converges. If $\sum_{n=0}^{N} a_n$ is unbounded, then one readily concludes
that $\sum_{n=0}^{\infty} a_n$ diverges to $+\infty$."

For  a  slight  variation  of  this,  see  Hardy,  1947,  page  137. 

Second illustration.

"If a is a least upper bound of E, then either a is in E or a is a limit point
of E.

"Proof. Let a not be in E. Since a is a least upper bound of E, for
every ε > 0 there is an x in E with a − ε < x ≤ a. Since a is not in E, x
must be different from a. So by definition, a is a limit point of E."

Third  illustration. 

This is taken from the proof of Thm. 45, page 41 of Hardy and Wright.¹

"Theorem 45. If x is a root of an equation

$$x^m + c_1 x^{m-1} + \cdots + c_m = 0,$$

with integral coefficients of which the first is unity, then x is either integral
or irrational.

"If x = a/b, where (a,b) = 1, then

$$a^m + c_1 a^{m-1} b + \cdots + c_m b^m = 0.$$

Hence $b \mid a^m$. So if b has any prime factor p, it must divide a also,
contradicting (a,b) = 1. So b has no prime factors, and so b = 1, x = a."

Note  that  the  proof  that  b  has  no  prime  factors  proceeds  by  reductio  ad 
absurdum. 
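
The logical forms of the three illustrations can be sketched in Lean 4 as follows (core library only). The exact wording of Ex. II.3.1(l), (m), and (n) is not reproduced in this section, so the three forms below are our reading of the illustrations and should be taken as an assumption; the auxiliary statement "R" in the first sketch would be a translation of "the partial sums are bounded".

```lean
-- First illustration: a case split on an auxiliary statement R.
example (P Q R : Prop) (h₁ : R → P) (h₂ : ¬R → Q) : P ∨ Q :=
  (Classical.em R).elim (fun hr => Or.inl (h₁ hr)) (fun hnr => Or.inr (h₂ hnr))

-- Second illustration: assume "¬P" and deduce "Q".
example (P Q : Prop) (h : ¬P → Q) : P ∨ Q :=
  (Classical.em P).elim (fun hp => Or.inl hp) (fun hnp => Or.inr (h hnp))

-- Third illustration: assume "¬Q" and deduce "P".
example (P Q : Prop) (h : ¬Q → P) : P ∨ Q :=
  (Classical.em Q).elim (fun hq => Or.inr hq) (fun hnq => Or.inl (h hnq))
```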

13. Proof of "P ≡ Q". Since "P ≡ Q" is "(P ⊃ Q)(Q ⊃ P)", it suffices
to prove the two implications "P ⊃ Q" and "Q ⊃ P". In view of the
multiplicity of ways of proving "P ⊃ Q", there is an even greater
multiplicity of ways of proving "P ≡ Q".

The most obvious and direct way is to assume "P" and deduce "Q",
thus proving "P ⊃ Q", and then assume "Q" and deduce "P", thus proving
"Q ⊃ P". A common alternative is to assume "P" and deduce "Q", and
then assume "~P" and deduce "~Q". This latter proves "~P ⊃ ~Q",
from which "Q ⊃ P" follows. However, many other possibilities present
themselves. Thus, to cite one of the possibilities, one might prove both
"P ⊃ Q" and "Q ⊃ P" by reductio ad absurdum.

We illustrate the three possibilities which we have specifically mentioned.
In Bôcher, 1907, page 36, we find the theorem:²

1  From  Hardy  and  Wright,  op.  cit. 

2 From Bôcher, op. cit.

"A necessary and sufficient condition for the linear dependence of the
m sets

$$x_1^{[i]}, x_2^{[i]}, \ldots, x_n^{[i]} \qquad (i = 1, 2, \ldots, m)$$

of n constants each, when m ≤ n, is that all the m-rowed determinants of
the matrix

$$\begin{Vmatrix} x_1^{[1]} & \cdots & x_n^{[1]} \\ \vdots & & \vdots \\ x_1^{[m]} & \cdots & x_n^{[m]} \end{Vmatrix}$$

should vanish."

To  prove  necessity,  Bocher  assumes  that  the  m  sets  of  constants  are 
linearly  dependent,  and  shows  that  all  the  m-rowed  determinants  vanish. 
To  prove  sufficiency,  Bocher  assumes  that  all  m-rowed  determinants 
vanish  and  shows  that  the  m  sets  of  constants  are  linearly  dependent. 

This is then a clear-cut instance of proving "P ≡ Q" by assuming "P"
and deducing "Q", and by assuming "Q" and deducing "P".

We turn now to an illustration of the proof of "P ≡ Q" by assuming "P"
and deducing "Q", and by assuming "~P" and deducing "~Q".

Consider the discussion on page 74 of Wentworth and Smith¹ of how to
"prove that a certain line or group of lines is the locus of a point that fulfills
a given condition ...." Their instructions are that one should prove two
things:¹

"1.  That  any  point  in  the  supposed  locus  satisfies  the  condition. 

"2.  That  any  point  outside  the  supposed  locus  does  not  satisfy  the  given 
condition." 

Let "P" be a translation of "the point A is in the supposed locus" and
let "Q" be a translation of "the point A satisfies the given condition".
Then part 1 of Wentworth and Smith's instructions says that we should
prove "P ⊃ Q" and part 2 says that we should prove "~P ⊃ ~Q".

From these two, one can certainly infer "P ≡ Q". However, "P ≡ Q"
says that being on the supposed locus is equivalent to satisfying the given
condition, and so the supposed locus must indeed be the correct one, as we
wished to prove.
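
The two ways of proving an equivalence which have been illustrated so far can be sketched in Lean 4 as follows (core library only; the rendering is ours, with "↔" for equivalence). The second sketch corresponds to the locus method, and passes from the second implication to its contrapositive by means of `Classical.byContradiction`.

```lean
-- Assume "P" and deduce "Q"; assume "Q" and deduce "P".
example (P Q : Prop) (h₁ : P → Q) (h₂ : Q → P) : P ↔ Q :=
  ⟨h₁, h₂⟩

-- Assume "P" and deduce "Q"; assume "¬P" and deduce "¬Q".
example (P Q : Prop) (h₁ : P → Q) (h₂ : ¬P → ¬Q) : P ↔ Q :=
  ⟨h₁, fun hq => Classical.byContradiction (fun hnp => h₂ hnp hq)⟩
```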

This is illustrated in the following proof, which we condense from page 75
of Wentworth and Smith.¹

1 From "Plane and Solid Geometry" by George Wentworth and D. E. Smith,
copyright 1913, courtesy of Ginn & Company.

Theorem. The locus of a point equidistant from the extremities of a
given line is the perpendicular bisector of that line (see Fig. II.4.6).

Fig. II.4.6.

Given:     YO,  the  perpendicular  bisector  of  the  line  AB. 

To  prove:     That  YO  is  the  locus  of  a  point  equidistant  from  A  and  B. 

Proof.  Let  P  be  any  point  in  YO,  and  C  any  point  not  in  YO.  Draw  the 
lines  PA,  PB,  CA,  and  CB.  Since  AO  =  BO  and  OP  is  common  to  both, 
the  right  triangles  AOP  and  BOP  have  two  sides  and  the  included  angle 
equal,  and  hence  are  congruent.  So  PA  =  PB.  Let  CA  cut  YO  at  D,  and 
draw  DB.    Then,  as  above,  DA  =  DB.    So 

CA = CD + DB > CB

since the straight line CB is the shortest distance between the two points
C and B. So CA ≠ CB. Therefore, YO is the required locus.

We now cite a proof of "P ≡ Q" in which one proves each of "P ⊃ Q"
and "Q ⊃ P" by reductio ad absurdum.

On page 152 of Bôcher, 1907, we find the theorem:¹

"A  necessary  and  sufficient  condition  that  a  real  quadratic  form  be 
definite  is  that  it  vanish  at  no  real  points  except  its  vertices  and  the  point 
(0,  0,  ...  ,  0)." 

In proving this, both "P ⊃ Q" and "Q ⊃ P" are proved by reductio ad
absurdum. The details are sufficiently complicated that it is pointless to
reproduce them. It happens that both these proofs by reductio ad
absurdum are of the sort discussed earlier where one assumes "P" and "~Q" and
deduces "~P" by use of "~Q" only. Thus with slight rewriting, this
proof could be thrown into the form where one proves "P ≡ Q" by assuming
"~Q" and deducing "~P", and assuming "~P" and deducing "~Q".

14.  A  situation  which  sometimes  arises  is  where  three  or  more  statements 
are  to  be  proved  mutual  necessary  and  sufficient  conditions  for  each  other. 

1 From Bôcher, op. cit.

Say there are four statements, "P", "Q", "R", and "S", and we wish to
prove the six equivalences "P ≡ Q", "P ≡ R", "P ≡ S", "Q ≡ R",
"Q ≡ S", and "R ≡ S". This means that we have twelve implications to
prove. However it will suffice to prove a complete "cycle" of them, say
"P ⊃ Q", "Q ⊃ R", "R ⊃ S", and "S ⊃ P", since the other eight can be
deduced from these four by repeated use of the principle that, if "P ⊃ Q"
and "Q ⊃ R" are proved, then "P ⊃ R" is proved. By choosing "P",
"Q", "R", and "S" in a judicious order, one can often arrange it so that
none of the implications of the cycle "P ⊃ Q", "Q ⊃ R", "R ⊃ S", and
"S ⊃ P" is very difficult to prove.

For an illustration, we turn to pages 10 to 11 of Stone, 1932. We quote:¹

"Theorem 1.9. The five following assertions concerning the orthonormal
set $\{\varphi_n\}$ are equivalent:

"(1) $\{\varphi_n\}$ is complete;

"(2) $(f, \varphi_n) = 0$ for every n implies $f = 0$;

"(3) the closed linear manifold determined by $\{\varphi_n\}$ is $\mathfrak{H}$;

"(4) for every f in $\mathfrak{H}$,

$$f = \sum_{\alpha=1}^{\infty} a_\alpha \varphi_\alpha, \qquad a_n = (f, \varphi_n);$$

"(5) for every pair f, g in $\mathfrak{H}$, the Parseval identity

$$(f, g) = \sum_{\alpha=1}^{\infty} a_\alpha \bar{b}_\alpha, \qquad a_n = (f, \varphi_n), \quad b_n = (g, \varphi_n)$$

is true.

"We shall show that the following inferences are possible:

$$1 \to 2 \to 3 \to 4 \to 5 \to 1,$$

each arrow being directed from hypothesis to conclusion. The equivalence
of the five assertions is then obvious."

Stone  then  proceeds  with  the  proofs  of  the  five  implications,  the  details 
of  which  do  not  concern  us  here. 
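
That a complete cycle of implications suffices can be checked on the case of four statements. The following is a minimal Lean 4 sketch (core library only; the rendering and the hypothesis names are ours). It derives three of the six equivalences, namely those not read off directly from two adjacent links of the cycle; the remaining three are obtained in the same way.

```lean
-- From the cycle P → Q → R → S → P, each implication not in the cycle
-- is obtained by chaining links of the cycle.
example (P Q R S : Prop)
    (pq : P → Q) (qr : Q → R) (rs : R → S) (sp : S → P) :
    (P ↔ R) ∧ (Q ↔ S) ∧ (S ↔ R) :=
  ⟨⟨fun p => qr (pq p), fun r => sp (rs r)⟩,
   ⟨fun q => rs (qr q), fun s => pq (sp s)⟩,
   ⟨fun s => qr (pq (sp s)), rs⟩⟩
```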

Another illustration, in which "P ≡ Q ≡ R" is proved by proving each
of "P ⊃ Q", "Q ⊃ R", and "R ⊃ P" is to be found in the proof of the
theorem on page 28 of Halmos, 1942. Another illustration with an unusual
twist to it appears in Boas and Pollard.

1 From "Linear Transformations in Hilbert Space and Their Application to Analysis"
by Marshall H. Stone, published 1932 by the American Mathematical Society, quoted
by permission.

15. In dealing with equivalences, one takes it for granted that, if
"P ≡ Q", then "Q ≡ P". This follows from Ex. II.3.1(o). Another
principle which is sometimes used is that, if "P ≡ Q" and "Q ≡ R", then
"P ≡ R". This follows from Ex. II.3.1(p). It is illustrated on page 3 of
Bôcher, 1907, by the proof of:¹

"Theorem 5. A necessary and sufficient condition that two polynomials
in x be identically equal is that they have the same coefficients."

Bôcher says that the proof follows from the two statements:¹

"Theorem 4. A necessary and sufficient condition that a polynomial in
x vanish identically is that all its coefficients be zero."

"... two polynomials in x are identically equal when and only when
their difference vanishes identically, ..."
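
The two principles of this item can be sketched in Lean 4 as follows (core library only; the rendering is ours). We use the core lemmas `Iff.symm` and `Iff.trans`, written here in dot notation.

```lean
-- If "P ↔ Q", then "Q ↔ P".
example (P Q : Prop) (h : P ↔ Q) : Q ↔ P :=
  h.symm

-- If "P ↔ Q" and "Q ↔ R", then "P ↔ R".
example (P Q R : Prop) (h₁ : P ↔ Q) (h₂ : Q ↔ R) : P ↔ R :=
  h₁.trans h₂
```

In the illustration from Bôcher, "Q" would be a translation of "the difference of the two polynomials vanishes identically".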

16. One often hears mentioned the converse of a statement. However,
the notion of the converse of a statement is not clearly defined in general.
When the statement is a simple one of the form "P ⊃ Q", there is essential
agreement as to what the converse is. Some people say that it is "Q ⊃ P"
while others say that it is "~P ⊃ ~Q", but since each of these follows from
the other, there is no appreciable distinction between them. So we may
say that, if we have a simple statement of the form "P ⊃ Q", then its
converse is whichever of "Q ⊃ P" or "~P ⊃ ~Q" is convenient to deal
with.

THE  CONVERSE  OF  A  TRUE  THEOREM  IS  NOT  NECESSARILY 
TRUE! 

Or  necessarily  false  either,  for  that  matter. 

Another thing to note is that, if two statements are equivalent, it is not
necessarily true that their converses are equivalent. Thus "P ⊃ (Q ⊃ R)"
and "PQ ⊃ R" are equivalent, but their converses "(Q ⊃ R) ⊃ P" and
"R ⊃ PQ" are not equivalent. Thus we see that by making quite
insignificant changes in the form of a statement we can induce great changes
in its converse.
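
Both halves of this remark can be checked in Lean 4 (core library only; the rendering is ours). The first sketch proves the equivalence of the two statements. The second shows that their converses are not equivalent in general, by taking "P" and "R" true and "Q" false; this choice of truth values is ours, made for illustration.

```lean
-- "P → (Q → R)" and "P ∧ Q → R" are equivalent.
example (P Q R : Prop) : (P → (Q → R)) ↔ (P ∧ Q → R) :=
  ⟨fun h hpq => h hpq.1 hpq.2, fun h hp hq => h ⟨hp, hq⟩⟩

-- Their converses are not equivalent: with P and R true and Q false,
-- "(Q → R) → P" holds while "R → P ∧ Q" fails.
example : ¬ ∀ P Q R : Prop, ((Q → R) → P) ↔ (R → P ∧ Q) :=
  fun h => ((h True False True).mp (fun _ => True.intro) True.intro).2
```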

Still further complications arise when one is considering the converse of
a more elaborate statement. When "P" or "Q" are elaborate, the converse
of "P ⊃ Q" is often not taken to be either of "Q ⊃ P" or "~P ⊃ ~Q",
but some other statement not equivalent to these. One form of statement
for which this commonly occurs is "P ⊃ (Q ⊃ R)", for which the converse
is rather often taken to be "P ⊃ (R ⊃ Q)" or "P ⊃ (~Q ⊃ ~R)".
Rather commonly this is done when P is some fairly trivial condition.
Thus consider Fort's Thm. 23 which we cited above. Fort's statement of it
is:²

"Theorem 23. Hypothesis: $\sum_{n=0}^{\infty} a_n$ diverges. Conclusion:
$\sum_{n=k}^{\infty} a_n$ diverges, k being any fixed integer."

Clearly this is of the form "P ⊃ (Q ⊃ R)", where "P", "Q", and "R"
denote respectively "k is a fixed (positive) integer", "$\sum_{n=0}^{\infty} a_n$ diverges", and
"$\sum_{n=k}^{\infty} a_n$ diverges".

¹ From Bocher, op. cit.

² From Fort, op. cit.

After the proof of this Thm. 23, Fort states that the converse is readily
proved. Quite clearly, he intends either "P ⊃ (~Q ⊃ ~R)" or
"P ⊃ (R ⊃ Q)" as the converse and would not for a moment consider
"(Q ⊃ R) ⊃ P" as the converse.

As another instance of taking "P ⊃ (R ⊃ Q)" to be the converse of
"P ⊃ (Q ⊃ R)", we quote from pages 100 to 101 of Wentworth and Smith.
On page 100 we find stated¹ "Proposition VII. In the same circle or in
equal circles, if two chords are unequal, they are unequally distant from
the center, and the greater chord is at the less distance." On page 101 we
find stated¹ "Proposition VIII. In the same circle or in equal circles, if two
chords are unequally distant from the center, they are unequal, and the
chord at the less distance is the greater." Below Proposition VIII is the
statement that it is the converse of Proposition VII. Here "In the same
circle or in equal circles" plays the role of "P", "chords are unequal" plays
the role of "Q", and "distances from the center are unequal" plays the
role of "R".

Another situation which sometimes arises is that in which the converse
of "PQ ⊃ R" is taken to be "PR ⊃ Q" rather than "R ⊃ PQ". Since
"PQ ⊃ R" is equivalent to "P ⊃ (Q ⊃ R)" and "PR ⊃ Q" is equivalent
to "P ⊃ (R ⊃ Q)", this is really a variant of the previous case where the
converse of "P ⊃ (Q ⊃ R)" was taken to be "P ⊃ (R ⊃ Q)".

Other interesting situations can be found. Thus in Agnew, 1942, page
113, is to be found:²

"Theorem 6.65. If two differentiable functions are linearly dependent
over the interval I, then their Wronskian vanishes over the interval I."

This seems to be simply of the form "P ⊃ Q", and the corresponding
statement of the form "Q ⊃ P" would appear to be:²

"If their Wronskian vanishes over the interval I, then two differentiable
functions are linearly dependent over the interval I."

Hence  it  would  seem  appropriate  to  take  the  second  statement  as  the 
converse  of  the  first.  Agnew  does  so  and  goes  on  to  make  the  point  that, 
whereas  the  first  statement  is  true,  its  converse  is  false. 

The interesting point here is that, although superficially the two statements
appear to have the forms "P ⊃ Q" and "Q ⊃ P", this is an illusion
due to the peculiarities of English grammar. Actually, if we let "P", "Q",
and "R" denote "f and g are differentiable", "f and g are linearly dependent
over the interval I", and "the Wronskian of f and g vanishes over the
interval I", then the first statement is translated by "PQ ⊃ R", whereas the
second statement is translated by "R ⊃ (P ⊃ Q)". As "R ⊃ (P ⊃ Q)"
is not equivalent to "R ⊃ PQ", we have here another instance of an unorthodox
converse. Actually, since "PQ ⊃ R" is equivalent to
"P ⊃ (Q ⊃ R)", and "R ⊃ (P ⊃ Q)" is equivalent to "P ⊃ (R ⊃ Q)", we have
here essentially another instance in which "P ⊃ (Q ⊃ R)" and
"P ⊃ (R ⊃ Q)" are taken to be converses. However, this instance has the
peculiarity that, as far as the English versions of the statements are concerned,
the two statements appear to have the forms "P ⊃ Q" and
"Q ⊃ P".

¹ From Wentworth and Smith, op. cit.

² From "Differential Equations" by R. P. Agnew, copyright 1942, courtesy of
McGraw-Hill Book Company, Inc.

One  can  find  still  more  unorthodox  instances  of  the  notion  of  a  converse. 
As  these  occur  only  occasionally,  and  with  considerable  variations,  there 
seems  no  point  in  cataloguing  them.  For  the  reader  who  is  curious  to  see 
one  such,  we  note  that  a  rather  startling  one  occurs  in  the  proof  of  Thm.  17, 
on  page  25  of  Birkhoff  and  MacLane. 
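
The equivalences asserted in the discussion of Agnew's Theorem 6.65 can likewise be confirmed by truth-value tables. The sketch below is our own illustration, not Agnew's or the text's; the dictionary `FORMS` and the function `disagreements` are hypothetical names, and the ASCII arrow `->` stands for the horseshoe of the text.

```python
from itertools import product


def implies(a: bool, b: bool) -> bool:
    """Material implication."""
    return (not a) or b


# P: f and g are differentiable; Q: they are linearly dependent over I;
# R: their Wronskian vanishes over I.
FORMS = {
    "PQ -> R": lambda p, q, r: implies(p and q, r),
    "P -> (Q -> R)": lambda p, q, r: implies(p, implies(q, r)),
    "R -> (P -> Q)": lambda p, q, r: implies(r, implies(p, q)),
    "P -> (R -> Q)": lambda p, q, r: implies(p, implies(r, q)),
    "R -> PQ": lambda p, q, r: implies(r, p and q),
}


def disagreements(name_a: str, name_b: str):
    """List the (P, Q, R) assignments on which the two forms take different values."""
    f, g = FORMS[name_a], FORMS[name_b]
    return [(p, q, r)
            for p, q, r in product([True, False], repeat=3)
            if f(p, q, r) != g(p, q, r)]


print(disagreements("PQ -> R", "P -> (Q -> R)"))        # []
print(disagreements("R -> (P -> Q)", "P -> (R -> Q)"))  # []
print(disagreements("R -> (P -> Q)", "R -> PQ"))
```

The last call returns exactly the two assignments in which R is true and P is false, so the stated converse and the orthodox converse part company only when the differentiability hypothesis P fails.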

5.  Summary  of  Logical  Principles.  We  summarize  the  logical  principles 
which  were  discussed  and  illustrated  in  the  preceding  section. 

1. Modus ponens. If "P" and "P ⊃ Q", then "Q".

2. If "P" and "Q", then "PQ".

3. If "P ⊃ Q" and "Q ⊃ R", then "P ⊃ R".

4. If "~Q ⊃ ~P", then "P ⊃ Q".

5. Proof by cases. If "P ⊃ R" and "Q ⊃ R", then "P∨Q ⊃ R".

6. Proof by cases. If "P ⊃ Q" and "~P ⊃ Q", then "Q".

7. Proof of "P" by reductio ad absurdum. Assume "~P" and deduce
"Q~Q".

8. Proof of "~P" by reductio ad absurdum. Assume "P" and deduce
"Q~Q".

9. Proof of "P ⊃ Q" by reductio ad absurdum. Assume "P~Q" and
deduce "R~R".

10. Proof of "P ⊃ Q" by reductio ad absurdum. Assume "P~Q" and
deduce "Q".

11. Proof of "P ⊃ Q" by reductio ad absurdum. Assume "P~Q" and
deduce "~P".

12. Miscellaneous proofs of "P∨Q". See Ex. II.3.1(l), (m), and (n).

13. Proof of "P ≡ Q". Prove "P ⊃ Q" and "Q ⊃ P".

14. Proof of "P ≡ Q ≡ R ≡ S". Prove "P ⊃ Q", "Q ⊃ R", "R ⊃ S",
and "S ⊃ P".

15. Proof of "P ≡ Q". Prove "P ≡ R" and "Q ≡ R".
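
Several of the principles above can be spot-checked by truth-value tables. The sketch below rests on an assumption of ours: a principle of the form "if A and B, then C" is read as the single statement formula "AB ⊃ C", and "assume A and deduce B" is read as "A ⊃ B". The names `PRINCIPLES`, `is_tautology`, and `verdicts` are ours and do not occur in the text.

```python
from itertools import product


def implies(a: bool, b: bool) -> bool:
    """Material implication."""
    return (not a) or b


def is_tautology(formula, n_letters: int) -> bool:
    """True when the formula takes the value T under every assignment."""
    return all(formula(*values)
               for values in product([True, False], repeat=n_letters))


# Principle number -> (number of letters, formula expressing the principle).
PRINCIPLES = {
    1: (2, lambda p, q: implies(p and implies(p, q), q)),
    3: (3, lambda p, q, r: implies(implies(p, q) and implies(q, r), implies(p, r))),
    4: (2, lambda p, q: implies(implies(not q, not p), implies(p, q))),
    5: (3, lambda p, q, r: implies(implies(p, r) and implies(q, r), implies(p or q, r))),
    6: (2, lambda p, q: implies(implies(p, q) and implies(not p, q), q)),
    7: (2, lambda p, q: implies(implies(not p, q and not q), p)),
    10: (2, lambda p, q: implies(implies(p and not q, q), implies(p, q))),
    11: (2, lambda p, q: implies(implies(p and not q, not p), implies(p, q))),
    13: (2, lambda p, q: implies(implies(p, q) and implies(q, p), p == q)),
    15: (3, lambda p, q, r: implies((p == r) and (q == r), p == q)),
}

verdicts = {number: is_tautology(formula, n)
            for number, (n, formula) in PRINCIPLES.items()}

# By contrast, passing from a statement to its converse is not a valid principle.
converse_is_valid = is_tautology(
    lambda p, q: implies(implies(p, q), implies(q, p)), 2)

print(verdicts)
print(converse_is_valid)
```

Every listed principle comes out universally valid under this reading, while the passage from "P ⊃ Q" to "Q ⊃ P" does not.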

EXERCISES 

II.5.1.  Find  illustrations  in  the  mathematical  literature  of  at  least  six 
of  the  fifteen  principles  listed  above. 
II.5.2. Find an illustration in the mathematical literature of a true
theorem "P ⊃ Q" whose converse "Q ⊃ P" is false.

II.5.3. On page 21 of Fort, 1930, appears the theorem:¹

"Hypothesis: $\sum_{n=0}^{\infty} |a_n|$ converges.

Conclusion: $\sum_{n=0}^{\infty} a_n$ converges and $\left|\sum_{n=0}^{\infty} a_n\right| \le \sum_{n=0}^{\infty} |a_n|$."

Without  looking  up  the  proof,  state  one  of  the  fifteen  principles  above 
which  will  be  used  in  the  proof. 

II.5.4. On page 32 of the present text occurs the statement: "AB and
CD  never  meet,  for  if  they  met,  the  point  at  which  they  meet  would  be  a 
point  at  which  planes  EF  and  GH  meet."  Which  of  the  fifteen  principles 
does  this  illustrate? 

II.5.5. Tell which of the fifteen principles is used in the following proof
taken from page 38 of Fort, 1930, in which it is assumed that the a's are all
positive.¹

"Theorem 49. Hypothesis: $\sum_{n=1}^{\infty} a_n$ diverges.

Conclusion: $\sum_{n=1}^{\infty} \frac{a_n}{a_1 + \cdots + a_n}$ diverges.

"Now suppose $\sum_{n=1}^{\infty} \frac{a_n}{a_1 + \cdots + a_n}$ convergent. Then

$$\sum_{\nu=n+1}^{n+p} \frac{a_\nu}{a_1 + \cdots + a_\nu} \to 0$$

when $n \to \infty$. Consequently it is possible to choose an m such that when
n > m,

$$1 - \frac{a_1 + \cdots + a_n}{a_1 + \cdots + a_{n+p}} < \frac{1}{2}$$

for example.

"But for any fixed n when $p \to \infty$

$$1 - \frac{a_1 + \cdots + a_n}{a_1 + \cdots + a_{n+p}} \to 1,$$

a contradiction."



II.5.6. Tell which of the fifteen principles is used in the following proof
(see Figs. II.5.1 to II.5.3) condensed from pages 118 to 119 of Wentworth
and Smith.²

An inscribed angle is measured by half the intercepted arc.

¹ From Fort, op. cit.

² From Wentworth and Smith, op. cit.
Given: A circle with the center O and the inscribed angle B, intercepting
the arc AC.

To prove: That ∠B is measured by half the arc AC.

Case 1. When O is on one side, as AB (Fig. II.5.1).

Proof. Draw OC. Then ∠AOC equals ∠B + ∠C. But ∠B = ∠C.
So ∠B = ½ ∠AOC. As ∠AOC is measured by arc AC (previous theorem),
we conclude that ∠B is measured by half arc AC.

Case 2. When O lies within the angle B (Fig. II.5.2).

Proof. Draw the diameter BD. By Case 1, ∠ABD is measured by half
arc AD and ∠DBC is measured by half arc DC. Adding gives ∠B measured
by half arc AC.

Case 3. When O lies outside the angle B (Fig. II.5.3).

Proof. Draw the diameter BD. By Case 1, ∠CBD is measured by
half arc CD and ∠ABD is measured by half arc AD. Subtracting gives
∠B measured by half arc AC. Q.E.D.

II.5.7. Find a case in the mathematical literature where the word
"converse" is used in some peculiar fashion, such as taking "P ⊃ (R ⊃ Q)"
to be the converse of "P ⊃ (Q ⊃ R)" or taking "PR ⊃ Q" to be the
converse of "PQ ⊃ R".

II.5.8. Identify the logical principle in the following quotation from
"Through the Looking-glass" by Lewis Carroll:

"  'Everybody  that  hears  me  sing  it — either  it  brings  the  tears  into  their 

eyes,  or  else ' 

"  'Or  else  what?'  said  Alice,  for  the  Knight  had  made  a  sudden  pause. 
"  'Or  else  it  doesn't,  you  know.'  " 


CHAPTER  III 
THE USE OF NAMES

"The name of the song is called 'Haddocks' Eyes.' "

"Oh,  that's  the  name  of  the  song,  is  it?"  Alice  said,  trying  to  feel  interested. 

"No,   you  don't  understand,"   the  Knight  said,   looking  a  little  vexed. 
"That's  what  the  name  is  called.    The  name  really  is,  'The  Aged  Aged  Man.'  " 

"Then  I  ought  to  have  said  'That's  what  the  song  is  called'?"  Alice  cor- 
rected herself. 

"No,  you  oughtn't:  that's  quite  another  thing!    The  song  is  called  'Ways 
and  Means':  but  that's  only  what  it's  called,  you  know!" 

"Well,  what  is  the  song,  then?"  said  Alice,  who  was  by  this  time  completely 
bewildered. 

"I  was  coming  to  that,"  the  Knight  said.  "The  song  really  is  'A-sitting  On  A 
Gate':  and  the  tune's  my  own  invention." 

Lewis  Carroll 

A  statement  about  something  generally  contains  a  name  of  that  thing, 
but  it  must  not  contain  the  thing  itself. 

Applied  to  natural  objects,  this  seems  quite  obvious,  since  in  such  case 
the  statement  usually  could  not  contain  the  thing  itself.  Consider  the 
statement "Georgia is a southern state." This contains the word
"Georgia," which is a name of the state in question. Clearly it would be
impracticable  to  replace  the  word  "Georgia"  in  this  statement  by  the  state 
itself. 

Similar  considerations  apply  to  "The  moon  is  made  of  green  cheese," 
"The  Atlantic  Ocean  is  wet,"  "The  Equator  is  long,"  etc. 

For  small  objects,  these  considerations  are  not  quite  so  conclusive.  In 
the  statement  "This  thumbtack  is  round,"  one  could  conceivably  erase  the 
words  "This  thumbtack"  and  in  the  empty  space  stick  the  thumbtack  into 
the  page.  One  can  make  rather  cogent  objections  that  the  resulting  con- 
glomeration of  two  words  and  a  thumbtack  is  not  a  statement,  but  a  rebus 
or  charade.  In  any  case,  any  other  statement  about  that  particular  thumb- 
tack positively  could  NOT  contain  the  thumbtack,  since  the  thumbtack 
has  now  been  preempted  to  appear  in  the  particular  place  indicated. 
Further,  and  for  the  same  reason,  if  one  wished  to  repeat  the  same  statement 
the  repetition  could  not  contain  the  thumbtack  but  must  contain  some 
name  of  the  thumbtack,  such  as  the  words  "This  thumbtack." 

For  these  and  other  reasons,  it  is  generally  agreed  that,  although  one 
could  put  together  a  combination  consisting  of  a  thumbtack  followed  by 
the words "is round," and although this combination would doubtless convey
information, nevertheless the combination does not constitute a
statement. 

This  seems  fair  enough.  Certainly  not  all  means  of  conveying  informa- 
tion are  necessarily  statements.  If  a  policeman  at  a  busy  intersection 
waves  us  to  stop,  he  has  certainly  conveyed  information,  but  hardly  in  the 
form  of  a  statement. 

To  summarize,  it  is  generally  agreed  about  statements  that  a  statement 
about  something  must  contain  a  name  of  that  thing,  rather  than  the  thing 
itself.    We  shall  conform  with  this  usage. 

Throughout  the  present  text,  we  are  repeatedly  making  statements  about 
other  statements,  and  by  the  above  dictum  the  statements  which  we  make 
must  contain  names  of  the  statements  about  which  we  are  speaking.  For  a 
word  or  a  statement,  one  standard  procedure  for  constructing  a  name  is  to 
enclose  the  word  or  statement  in  quotation  marks.  Thus  one  writes: 
"Georgia  is  a  southern  state"  contains  "Georgia." 

This  is  a  statement  about  the  statement  "Georgia  is  a  southern  state"  and 
the  word  "Georgia,"  and  so  contains  names  of  the  statement  and  the  word, 
to  wit,  the  statement  and  the  word  together  with  surrounding  quotation 
marks. 

If  one  wishes  to  talk  about  a  name  of  a  statement  or  of  a  word,  one  must 
use  a  name  of  this  name.     This  becomes  rather  awkward  as  in  the  final 
words  of  the  preceding  paragraph,  or  in  the  statement : 
The  name  of  "Georgia"  is  "  'Georgia'." 

In  writing  the  above  statement,  we  are  making  a  statement  about  a  word 
and  a  name  of  the  word,  and  so  must  actually  use  a  name  of  the  word  and  a 
name  of  the  name  of  the  word.  Thus  we  have  to  use  the  awkward  double 
quotation  marks  to  make  the  simple  statement  about  the  particular  word 
"Georgia"  that,  if  one  has  a  word  without  quotation  marks,  one  forms  its 
name  by  enclosing  it  in  quotation  marks. 
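
Readers who program will recognize the same layering in the way a programming language quotes strings. The following is offered only as an analogy of ours, not as anything in the original text: in Python, the built-in `repr` returns a piece of text that names a string by enclosing it in quotation marks, so applying it twice yields a name of a name.

```python
# The word itself, held as a Python string of seven characters.
word = "Georgia"

# A name of the word: the word enclosed in quotation marks.
name_of_word = repr(word)

# A name of the name: the quoted word enclosed in a second pair of marks.
name_of_name = repr(name_of_word)

print(word)          # Georgia
print(name_of_word)  # 'Georgia'
print(name_of_name)  # "'Georgia'"

# Each level of quotation adds exactly two characters.
lengths = (len(word), len(name_of_word), len(name_of_name))
```

The third line of output has the same double layer of quotation marks as the displayed statement about the name of "Georgia".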

Such awkward situations arise only when one is discussing names of
words or statements (as we are doing now). Throughout most of the
study of logic we make statements only about other statements and not
about names of other statements. Thus we have to USE names of statements
(requiring use of single quotation marks), but we do not have to
discuss names of statements (and so have no need for quotation marks
within quotation marks). One place where one does have to discuss names
of statements is in the proof of Gödel's theorem that a consistent logic adequate
for mathematics is incomplete. In this proof we have to use names
of names of statements in order to discuss the names of statements. Failure
to comprehend this point is one of the major causes of difficulty in comprehending
the proof of Gödel's theorem.
For the greater part of logic, no such difficulties arise. Thus for the
greater  part  of  logic  one  can  be  rather  careless  about  the  use  of  quotation 
marks  without  getting  into  any  difficulty.  Such  carelessness  is  usual,  and 
in  the  future  we  shall  indulge  in  such  carelessness  ourselves.  Up  to  now, 
we  have  scrupulously  tried  to  be  correct.  When  referring  to  statements, 
we  have  used  names  of  the  statements,  to  wit,  the  statements  enclosed  in 
quotation  marks.  This  has  been  done  consistently,  both  for  statements  of 
English  and  statements  of  symbolic  logic  except  in  displayed  formulas, 
where  the  act  of  displaying  was  considered  tantamount  to  enclosure  in 
quotation  marks.  Similarly,  use  in  a  truth-value  table  was  considered 
tantamount  to  enclosure  in  quotation  marks  of  the  "P",  "Q",  etc.,  which 
required  enclosure  in  quotation  marks.  We  took  "T"  and  "F"  as  being 
the  names  of  truth  values  and  hence  not  requiring  enclosure  in  quotation 
marks.  For  them,  use  in  a  truth-value  table  was  not  considered  tanta- 
mount to  enclosure  in  quotation  marks. 

As an illustration of the use of quotation marks when making statements
about statements, consider the following sentences of Sec. 1 of Chapter II
(which we take the license of quoting without enclosing in quotation
marks):

To illustrate, let "P" and "Q" be translations of "It is raining now" and
"It is not cloudy now." Then "(P&Q)" is a translation of "It is raining
now and it is not cloudy now," and "~P" and "~Q" are translations
of "It is not raining now" and "It is cloudy now."

In such an explanation, carelessness with quotation marks is inadvisable,
and we shall not indulge in it. However, in extensive formal developments,
which make up the bulk of many subsequent chapters, one can omit all
quotation marks without danger of confusion and with a saving in complexity.
In such case, we make such omission without comment.

Be  it  understood  that  we  are  not  admitting  such  omission  of  quotation 
marks  to  be  correct;  we  are  merely  condoning  it  as  convenient. 

This is in line with current mathematical practice in dealing with
symbols. A statement such as "If x and y are numbers, then x + y =
y + x" violates the rule about using names of things when speaking of these
things. It should properly be written as "If 'x' and 'y' are numbers,
then 'x + y' = 'y + x'." However, in the first version (without quotation
marks), there can be no more doubt of the meaning intended than in the
second. Hence, if it is understood that the first is merely shorthand for
the second, there can be no harm in using it.

In a printed text, the "x" and "y" in such statements are customarily
written in italics, thereby decreasing the likelihood of any misunderstanding.

It is of interest that there is a place in mathematics where confusion
occasionally arises from a failure to preserve a careful distinction between
an object and its name. This is in connection with fractions.

The fractions "3/4", "6/8", "9/12", etc., are all names of a certain
rational number, which incidentally has many other names such as "0.75",
"$\sqrt{0.5625}$", "$\int_0^1 3x^3\,dx$", etc. Thus if one writes "3/4 = 9/12," one is
making a statement about the rational number, and names of the rational
number appear in the statement. This is as it should be. However, if
one writes "3 divides the denominator of 9/12," one is not making a
statement about the rational number, but about one of its names. Thus
one should write instead "3 divides the denominator of '9/12'."

The chance for confusion here is slight. However, one will occasionally
encounter an alert youngster who wishes to know why, if "3/4 = 9/12,"
one cannot replace "9/12" by "3/4" in the statement "3 divides the denominator
of 9/12," whereas it is perfectly correct to replace "9/12" by
"3/4" in the statement "3 is greater than 9/12." The answer, of course,
is that "9/12" actually occurs in the second statement, whereas "9/12"
does not actually appear in the first statement but only a name of "9/12".

Since  the  first  statement  is  incorrectly  written,  it  appears  to  contain 
"9/12"  at  a  point  where  it  really  contains  a  name  of  "9/12". 

The alert youngster may then inquire why one cannot replace "9/12"
by "3/4" in the correctly formulated statement "3 divides the denominator
of '9/12'," since this also contains "9/12". The answer in this case is that
the occurrence of "9/12" in the correct version of the sentence is a purely
typographical occurrence, like the "s" in "sin x" or the "d" in "dy/dx".
If one is given "s = d", one would not think of replacing "s" by "d" in
"sin x" or "d" by "s" in "dy/dx". If one had used some other name for
"9/12", there would have been no question of substituting "3/4". Thus
one might have written "3 divides the denominator of the fraction got by
placing a '9' over a bar and a '12' under the bar."

The  gist  of  the  matter  is  that,  if  we  have  a  statement  such  as  "3  is  greater 
than  9/12"  about  the  rational  number  9/12  and  containing  a  name  "9/12" 
of  this  rational  number,  we  can  replace  this  name  by  any  other  name  of  the 
same  rational  number,  for  instance,  "3/4".  If  we  have  a  statement  such  as 
"3  divides  the  denominator  of  '9/12'  "  about  a  name  of  a  rational  number 
and  containing  a  name  of  this  name,  we  can  replace  this  name  of  the  name 
by  some  other  name  of  the  same  name,  but  not  in  general  by  the  name  of 
some  other  name,  even  if  it  is  a  name  of  some  other  name  of  the  same 
rational  number. 
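
The distinction drawn here between a rational number and its names has a direct counterpart in programming, where a number and the text that denotes it are values of different types. The sketch below is our own illustration under that analogy; it uses Python's standard `fractions.Fraction` for the rational number, while `value_of_name` and `denominator_of_name` are hypothetical helper names.

```python
from fractions import Fraction


def value_of_name(name: str) -> Fraction:
    """The rational number named by a string such as "9/12"."""
    return Fraction(name)


def denominator_of_name(name: str) -> int:
    """The denominator read off the name itself, not off the number named."""
    _numerator_text, denominator_text = name.split("/")
    return int(denominator_text)


# A statement about the number: any name of the same number may be substituted.
same_number = value_of_name("9/12") == value_of_name("3/4")
greater_a = 3 > value_of_name("9/12")
greater_b = 3 > value_of_name("3/4")

# A statement about the name: substituting another name changes the truth value.
divides_a = denominator_of_name("9/12") % 3 == 0
divides_b = denominator_of_name("3/4") % 3 == 0

print(same_number, greater_a, greater_b)  # True True True
print(divides_a, divides_b)               # True False
```

Note that `Fraction("9/12").denominator` is 4: the number retains no memory of which of its names was used to write it, which is why a question about "the denominator of '9/12'" has to be put to the name.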

As  we  said  once  before,  failure  to  observe  such  distinctions  carefully  can 
seldom  lead  to  confusion  in  logic  and  still  less  seldom  in  mathematics,  and 
so  in  much  of  our  text  we  shall  omit  the  quotation  marks  that  should  appear. 
In  so  doing,  we  follow  accepted  mathematical  practice. 
Meanwhile, there is one important point concerning names within our
symbolic logic. Names are important constituents of statements in symbolic
logic even as in more familiar languages. In using the English language,
we always assume that our sentences have meaning and that the
names which occur in them are the names of something. Not so in our
symbolic logic. There is no requirement that a statement of symbolic
logic have meaning. Consequently the names which occur in such statements
need not be names of anything. This is very convenient, since we
are thus entitled to use names without first (or ever) being assured that
they are the names of something. We shall amplify this point when we
introduce names within our symbolic logic (see Chapter VIII).


CHAPTER  IV 
AXIOMATIC  TREATMENT  OF  THE  STATEMENT  CALCULUS 

1.  The  Axiomatic  Method  of  Defining  Valid  Statements.  In  effect  the 
axiomatic  definition  of  valid  statements  is  accomplished  as  follows.  Certain 
statements,  chosen  rather  arbitrarily,  are  called  axioms.  We  then  consider 
to  be  valid  statements  just  those  statements  which  are  either  axioms,  or 
can  be  derived  from  several  axioms  by  successive  uses  of  modus  ponens. 

We  shall  make  the  above  definition  more  precise  in  the  next  section,  but 
for  the  moment  we  wish  to  discuss  certain  general  aspects  of  the  axiomatic 
method.  In  the  first  place,  modus  ponens  depends  only  on  the  forms  of  the 
statements  involved,  and  not  at  all  on  their  meanings.  Hence,  if  the  choice 
of  axioms  is  made  to  depend  upon  form  only,  then  the  axiomatic  definition 
of  valid  statement  will  depend  entirely  on  the  forms  of  the  statement.  This 
accords  with  our  stipulation  that  our  symbolic  logic  shall  be  independent 
of  the  meanings  of  the  statements. 

In  Chapter  II,  we  learned  the  method  of  truth-value  tables  whereby  we 
could  classify  certain  statements  as  universally  valid.  However,  this 
applied  only  to  statement  formulas,  which  are  a  very  specialized  type  of 
statement.  Unfortunately,  the  very  convenient  method  of  truth-value 
tables  cannot  be  generalized  to  general  types  of  statements,  for  which  we 
must  use  some  more  general  method.  So  far,  only  the  axiomatic  method  is 
known for handling the most general types of statements. One of the
unsolved  problems  is  to  find  some  method  other  than  the  axiomatic  method 
for  handling  the  most  general  types  of  statements. 

It  will  be  some  time  before  we  are  prepared  to  give  a  complete  list  of 
axioms.  In  the  present  chapter  we  shall  list  a  subset  of  the  axioms  with  the 
following  interesting  properties.  In  the  first  place,  any  universally  valid 
statement  formula  can  be  derived  from  axioms  of  this  subset  by  means  of 
modus  ponens.  In  the  second  place,  any  statement  which  can  be  derived 
from axioms of this subset by means of modus ponens is a universally valid
statement  formula.  Accordingly,  the  axioms  of  this  subset  are  known  as 
the  truth-value  axioms. 

Consider  the  two  properties  of  the  truth-value  axioms  which  we  cited  in 
the  preceding  paragraph.  These  properties  are  statements  about  our  sym- 
bolic logic,  and  not  statements  within  our  symbolic  logic.  Accordingly, 
the  proofs  which  we  will  give  of  the  statements  must  be  carried  out  in 
intuitive  logic,  and  not  within  the  symbolic  logic.    We  have  referred  earlier 
to this matter of proving statements about the symbolic logic, and remind
the  reader  that  we  rigidly  restrict  ourselves  to  the  use  of  a  very  simple 
intuitive  logic  for  such  proofs.  We  postpone  a  more  extensive  discussion 
of  this  intuitive  logic  until  the  next  chapter.  However,  let  us  emphasize 
again  the  distinction  between  a  proof  within  the  symbolic  logic  and  a  proof 
about  the  symbolic  logic.  A  proof  within  the  symbolic  logic  shall  be  a 
sequence  of  uses  of  modus  ponens  applied  to  some  set  of  axioms.  Once 
written  out,  it  can  be  checked  quite  mechanically.  On  the  other  hand,  a 
proof  about  the  symbolic  logic  will  require  more  than  a  rudimentary 
intelligence  for  its  comprehension,  even  though  we  permit  only  quite  simple 
proofs  about  the  symbolic  logic.  A  typical  result  that  we  shall  prove  about 
our  symbolic  logic  is  that  for  statements  of  a  particular  form  proofs  within 
the  symbolic  logic  can  be  found;  in  particular,  the  present  chapter  will  be 
mainly  devoted  to  an  intuitive  proof  about  the  symbolic  logic  that,  for 
each  statement  formula  which  always  takes  the  value  T,  there  can  be  found 
a  proof  within  the  symbolic  logic.  This  intuitive  proof  will  be  construc- 
tive in  that  we  shall  give  very  explicit  directions  for  writing  out  the  sequence 
of  uses  of  modus  ponens  which  will  constitute  the  proof  within  the  symbolic 
logic. 

We wish to make one point clear about our use of the word "axiom."
Originally the word was used by Euclid to mean a "self-evident truth."
This use of the word "axiom" has long been completely obsolete in mathematical
circles. For us, the axioms are a set of arbitrarily chosen statements
which, together with the rule of modus ponens, suffice to derive all the
statements which we wish to derive. This corresponds to the standard
mathematical usage of the word "axiom."

2.  The  Truth-value  Axioms.  We  present  in  the  present  chapter  just 
that  subset  of  all  the  axioms  which  we  have  chosen  to  call  the  truth-value 
axioms,  namely,  a  subset  such  that  the  axioms  of  the  subset  plus  all  the 
statements  which  can  be  derived  from  them  by  successive  uses  of  modus 
ponens  constitute  exactly  the  set  of  statement  formulas  which  always  take 
the  value  T.  One  obvious  way  to  choose  such  a  set  of  truth-value  axioms 
is  to  have  the  truth-value  axioms  consist  of  exactly  those  statement  for- 
mulas which  always  take  the  value  T;  this  procedure  is  followed  by  some 
logicians.  However,  we  consider  it  more  elegant  and  more  instructive  to 
choose  a  smaller  subset  of  the  axioms  to  be  our  truth-value  axioms,  and 
we  do  so. 

Our truth-value axioms shall consist of an infinite number of statements,
but they will be of only three general forms, which we call axiom schemes,
to wit:

1. P ⊃ PP.

2. PQ ⊃ P.

3. P ⊃ Q .⊃. ~(QR) ⊃ ~(RP).

The precise definition of the truth-value axioms is:
If P, Q, and R are statements, not necessarily distinct, then each of the
following is an axiom:

1. P ⊃ PP.

2. PQ ⊃ P.

3. P ⊃ Q .⊃. ~(QR) ⊃ ~(RP).

We  shall  refer  to  the  three  forms  above  as  Axiom  schemes  1,  2,  and  3, 
respectively.  Any  particular  axiom  having  one  of  these  forms  will  be 
referred  to  as  an  instance  of  the  axiom  scheme  in  question. 

Because we can let P and Q be the same, we infer that PP ⊃ P is an
axiom, being an instance of Axiom scheme 2. As another illustration of an
axiom, we take P to be ~R in Axiom scheme 1, and infer that
~R ⊃ ~R~R is an axiom. Written in unabbreviated form, this is
~(~R~(~R~R)), which is the unabbreviated form of R∨~R~R. So
R∨~R~R is also an axiom, being an instance of Axiom scheme 1.
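
The passage from abbreviated to unabbreviated form can be imitated on a machine. The sketch below is ours and rests on two assumptions read off from the illustration just given: that P ⊃ Q abbreviates ~(P~Q) and that P∨Q abbreviates ~(~P~Q). Statements are modelled as nested tuples built from negation and conjunction only; all function names are hypothetical.

```python
from itertools import product


def Not(a):
    return ("not", a)


def And(a, b):
    return ("and", a, b)


def Implies(a, b):
    """Assumed abbreviation: P implies Q stands for not(P and not Q)."""
    return Not(And(a, Not(b)))


def Or(a, b):
    """Assumed abbreviation: P or Q stands for not(not P and not Q)."""
    return Not(And(Not(a), Not(b)))


def scheme1(p):
    return Implies(p, And(p, p))


def scheme2(p, q):
    return Implies(And(p, q), p)


def scheme3(p, q, r):
    return Implies(Implies(p, q), Implies(Not(And(q, r)), Not(And(r, p))))


def evaluate(formula, env):
    """Truth value of a formula under an assignment of values to its letters."""
    if isinstance(formula, str):
        return env[formula]
    if formula[0] == "not":
        return not evaluate(formula[1], env)
    return evaluate(formula[1], env) and evaluate(formula[2], env)


def always_true(formula, letters):
    """True when the formula takes the value T under every assignment."""
    return all(evaluate(formula, dict(zip(letters, values)))
               for values in product([True, False], repeat=len(letters)))


# The instance of Axiom scheme 1 with P taken to be ~R, fully unabbreviated,
# is literally the same object as the unabbreviated form of R v ~R~R.
instance = scheme1(Not("R"))
same_form = instance == Or("R", And(Not("R"), Not("R")))

schemes_valid = (always_true(scheme1("P"), ["P"]),
                 always_true(scheme2("P", "Q"), ["P", "Q"]),
                 always_true(scheme3("P", "Q", "R"), ["P", "Q", "R"]))
print(same_form, schemes_valid)
```

The comparison `same_form` is purely a comparison of forms; no truth values enter into it, in keeping with the requirement that axioms be identifiable by form alone.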

The axioms defined by Axiom schemes 1, 2, and 3 are called the truth-value
axioms and are only a subset of all the axioms which we shall eventually
present. It is our intention to define our axioms so that they can be
identified by reference to their form only and without any reference to
their meaning. It will be observed that we have abided by this restriction
in defining the truth-value axioms.

We shall show that the truth-value axioms together with the statements
which can be derived from them by successive uses of modus ponens constitute
exactly the statement formulas which always take the value T. We
now define precisely the vague phrase "the statements which can be
derived from them by successive uses of modus ponens." We first generalize
this to the case where Q is derived not merely from the axioms, but also
from certain assumptions P_1, P_2, ..., P_n. We introduce the notation:

P_1, P_2, ..., P_n ⊢ Q.

The  intuitive  meaning  of  this  is  to  be  that  either  Q  is  an  axiom  or  else  Q 
is  one  of  the  P's,  or  else  Q  is  derived  from  the  axioms  and  the  P's  by  suc- 
cessive uses  of  modus  ponens.    The  precise  definition  is  as  follows: 

P1, P2, ..., Pn |- Q indicates that there is a sequence of statements
S1, S2, ..., Ss such that Ss is Q and for each Si either:

(1)  Si is an axiom.

(2)  Si is a P.

(3)  Si is the same as some earlier Sj.

(4)  Si is derived from two earlier S's by modus ponens.

More precise versions of (3) and (4) are:

(3)  There is a j less than i such that Si and Sj are the same.

(4)  There are j and k, each less than i, such that Sk is Sj D Si.

In  the  precise  version  of  (4),  Sj  is  the  minor  premise,  and  Sk  is  the  major 
premise. 

The sequence of statements S1, S2, ..., Ss is called a demonstration of
P1, P2, ..., Pn |- Q. This sequence S1, S2, ..., Ss constitutes a proof within
the symbolic logic that Q is a logical consequence of the assumptions
P1, P2, ..., Pn. That is why the sequence is called a demonstration. The
S's which make up the sequence are called the steps of the demonstration.
The  reader  should  note  the  analogy  with  the  more  standard  mathematical 
usage  of  the  words  "demonstration"  and  "steps  of  a  demonstration." 

By  (3),  one  is  permitted  to  repeat  any  step  of  a  demonstration  as  often 
as  desired.  In  constructing  a  demonstration,  this  usually  serves  no  useful 
purpose,  and  it  would  ordinarily  be  silly  to  repeat  steps.  However,  it  will 
simplify  certain  of  our  proofs  of  theorems  about  the  symbolic  logic  if 
repetition  of  steps  is  permitted.  As  repetition  of  steps  is  harmless,  we 
therefore  permit  it. 

Notice that, if a sequence of statements S1, S2, ..., Ss be written down,
then it is a perfectly mechanical procedure to check whether it is or is not a
demonstration of P1, P2, ..., Pn |- Q.
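To make the word "mechanical" concrete, here is a minimal Python sketch of such a check. It is an illustration under assumptions: statements are encoded as nested tuples, the names match, is_axiom, and is_demonstration are hypothetical, and only the three truth-value axiom schemes are recognized as axioms, since no other axioms have been presented at this point.

```python
# Statements are nested tuples: ("not", A) for ~A, ("and", A, B) for AB,
# and plain strings for statement letters. (Assumed encoding.)

def neg(a):
    return ("not", a)

def conj(a, b):
    return ("and", a, b)

def implies(a, b):
    # A D B abbreviates ~(A~B)
    return neg(conj(a, neg(b)))

# The three axiom schemes, written with schematic letters "P", "Q", "R".
SCHEMES = [
    implies("P", conj("P", "P")),                                # 1. P D PP
    implies(conj("P", "Q"), "P"),                                # 2. PQ D P
    implies(implies("P", "Q"),
            implies(neg(conj("Q", "R")), neg(conj("R", "P")))),  # 3.
]

def match(pattern, formula, binding):
    """Extend binding so that the pattern, with its letters replaced, is the formula."""
    if isinstance(pattern, str):
        if pattern in binding:
            return binding[pattern] == formula
        binding[pattern] = formula
        return True
    if isinstance(formula, str) or pattern[0] != formula[0]:
        return False
    return all(match(p, f, binding) for p, f in zip(pattern[1:], formula[1:]))

def is_axiom(formula):
    # An axiom is any instance of one of the three schemes.
    return any(match(scheme, formula, {}) for scheme in SCHEMES)

def is_demonstration(steps, assumptions, conclusion):
    """Check conditions (1) to (4) for every step; the last step must be Q."""
    if not steps or steps[-1] != conclusion:
        return False
    for i, step in enumerate(steps):
        earlier = steps[:i]
        ok = (is_axiom(step)                                           # (1)
              or step in assumptions                                   # (2)
              or step in earlier                                       # (3)
              or any(implies(sj, step) in earlier for sj in earlier))  # (4)
        if not ok:
            return False
    return True

P, Q = "P", "Q"
print(is_demonstration([P, implies(P, Q), Q], [P, implies(P, Q)], Q))
print(is_demonstration([Q], [P], Q))
```

The check for (4) looks for an earlier step Sj and an earlier step of the form Sj D Si, exactly as in the precise version of (4). Every test is a comparison of forms; nothing about the meaning of the statements is consulted.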

The symbol "|-" is called a turnstile. The statement P1, P2, ..., Pn |- Q
is read as "P1, P2, ..., Pn yield Q". In case there are no P's, we write
simply |- Q and read "yields Q". This case is of especial importance, since
it signifies that Q is derived from the axioms alone, without any
assumptions P1, P2, ..., Pn. Hence |- Q signifies that Q is a consequence of the
axioms, or that Q is a provable statement of our symbolic logic.

One point which is made clear by our precise definition of |- Q is that, in
deriving  Q  from  the  axioms  by  a  succession  of  uses  of  modus  ponens,  one 
can  apply  modus  ponens  not  merely  to  the  axioms  but  also  to  any  results 
derived  from  them. 

Another point about our definition of P1, P2, ..., Pn |- Q is that it states
that Q can be derived from the assumptions P1, P2, ..., Pn with the help
of all the axioms, and not merely with the help of the truth-value axioms.
Temporarily, throughout the remainder of the present section, we shall let
P1, P2, ..., Pn |-* Q denote that Q can be derived from the assumptions
P1, P2, ..., Pn with the help of the truth-value axioms alone. The
definition of P1, P2, ..., Pn |-* Q is just like that of P1, P2, ..., Pn |- Q except
that we replace (1) by:

(1*)  Si  is  a  truth-value  axiom. 

We can now express our earlier clumsy and somewhat vague statement
"... the truth-value axioms together with the statements which can be
derived from them by successive uses of modus ponens constitute exactly
the statement formulas which always take the value T" in the neater and
more precise form "|-* Q if and only if Q is a statement formula which always
takes the value T."

The  statement  in  quotation  marks  just  above  is  an  equivalence.  Al- 
though it  is  an  intuitive  equivalence,  nevertheless,  many  of  the  remarks  in 
Chapter  II  about  proving  equivalences  apply,  and  we  can  (and  shall) 
prove it by proving the two implications:

If |-* Q, then Q is a statement formula which always takes the value T.

If Q is a statement formula which always takes the value T, then |-* Q.

We now prove the first of these. Assume |-* Q. By the definition of
|-* Q, there is a sequence of statements S1, S2, ..., Ss such that Ss is Q and
for each Si either:

(1*)  Si is one of the axioms specified by Axiom schemes 1, 2, or 3.

(3)  There is a j less than i such that Si and Sj are the same.

(4)  There are j and k, each less than i, such that Sk is Sj D Si.

Note the absence of (2), due to the lack of P's before the |-* Q.

We now prove by induction on m that Sm is a statement formula which
always takes the value T. If m = 1, we must have (1*). By making
truth-value tables for our three axiom schemes, we verify that each axiom
specified by Axiom schemes 1, 2, or 3 is a statement formula which always
takes the value T. So S1 is a statement formula which always takes the
value T. Now suppose that each of S1, S2, ..., Sm is a statement formula
which always takes the value T, and put i = m + 1. If (1*), we use the
same argument as for m = 1. If (3), then S(m+1) is the same as some earlier
S, and so is a statement formula which always takes the value T. If (4),
then Sk is Sj D S(m+1), where each of Sj and Sk occurs before S(m+1). So each
of Sj and Sj D S(m+1) is a statement formula which always takes the value T.
Inspection of the truth-value table for D shows that in this case S(m+1) must
also always take the value T. Also S(m+1) must be a statement formula.

Since Sm is a statement formula which always takes the value T for each
m, we put m = s and infer that Q is a statement formula which always takes
the value T.
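The two truth-table facts used in this induction can be checked by brute force. The following Python sketch is only an illustration; the tuple encoding of statements and the names value, letters, and always_true are assumptions made for this example.

```python
from itertools import product

# Statements are nested tuples: ("not", A) for ~A, ("and", A, B) for AB,
# and plain strings for statement letters. (Assumed encoding.)

def neg(a):
    return ("not", a)

def conj(a, b):
    return ("and", a, b)

def implies(a, b):
    # A D B abbreviates ~(A~B)
    return neg(conj(a, neg(b)))

def value(f, assignment):
    """Truth value of f under a dict mapping each letter to True (T) or False (F)."""
    if isinstance(f, str):
        return assignment[f]
    if f[0] == "not":
        return not value(f[1], assignment)
    return value(f[1], assignment) and value(f[2], assignment)

def letters(f):
    """The set of statement letters occurring in f."""
    if isinstance(f, str):
        return {f}
    return set().union(*(letters(x) for x in f[1:]))

def always_true(f):
    """True if f takes the value T in every row of its truth-value table."""
    names = sorted(letters(f))
    return all(value(f, dict(zip(names, row)))
               for row in product([True, False], repeat=len(names)))

P, Q, R = "P", "Q", "R"
scheme1 = implies(P, conj(P, P))
scheme2 = implies(conj(P, Q), P)
scheme3 = implies(implies(P, Q), implies(neg(conj(Q, R)), neg(conj(R, P))))

# Modus ponens preserves T: in every row where A and A D B are both T, B is T.
mp_safe = all(b for a, b in product([True, False], repeat=2)
              if a and value(implies("A", "B"), {"A": a, "B": b}))

print(always_true(scheme1), always_true(scheme2), always_true(scheme3), mp_safe)
```

Checking the schemes with P, Q, and R treated as letters covers every instance, since whatever statements are put for P, Q, and R can themselves only take the values T and F.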

The converse, "If Q is a statement formula which always takes the value
T, then |-* Q," is much harder to prove. We devote the next three sections
to its proof. However, throughout the next three sections we shall use |-
instead of |-*. We are not doing this with the understanding that |- stands
for |-*, but we actually intend |-. The point is that the results with |-*,
while interesting, are not what we shall need for later developments. It is
the results with |- that we shall need later, and it is these which we shall
prove. However, for the benefit of any interested reader, we point out that
throughout the next three sections all our proofs are such that all our
theorems and proofs would still hold if we should replace |- by |-*. Thus in
Sec. 5 we prove Thm.IV.5.3, which says that, if Q is a statement formula
which always takes the value T, then |- Q. However, our proof would
equally well prove "If Q is a statement formula which always takes the
value T, then |-* Q." As the latter statement is of some interest, we
therefore claim that we prove it, even though all that we state at the time and
ever make use of afterwards is the weaker statement with |-.

The weaker result with |- is a very useful result, since it gives us quite a
handy way of proving |- Q for a considerable number of Q's.

We  shall  refer  to  this  result  as  the  "truth-value  theorem." 

EXERCISES

IV.2.1.  Prove P |- PP.

IV.2.2.  Prove PP |- P.

IV.2.3.  Prove ~(~PP), ~P D ~P |- P D P.  (Hint. Put ~P, ~P,
and P for P, Q, R in Axiom scheme 3.)

IV.2.4.  Prove that, if P1, P2, ..., Pn |-* Q, then P1, P2, ..., Pn |- Q.

3.  Properties of |-.  Various properties of |- are almost obvious, but it is
perhaps worth while stating and proving them explicitly.

Theorem IV.3.1.  If P1, ..., Pn |- Q, then P1, ..., Pn, R1, ..., Rm |- Q.

Proof.  Clearly any sequence of S's which will serve for a demonstration
of P1, ..., Pn |- Q will serve equally well for a demonstration of P1, ..., Pn,
R1, ..., Rm |- Q.

Theorem IV.3.2.  If P1, ..., Pn |- Q1 and Q1, ..., Qm |- R, then
P1, ..., Pn, Q2, ..., Qm |- R.

Proof.  Let S1, ..., Ss be a demonstration of P1, ..., Pn |- Q1 and
Σ1, ..., Σσ be a demonstration of Q1, ..., Qm |- R. Then each S is either
an axiom or a P or an earlier S or derived from two earlier S's by modus
ponens, and Ss is Q1. Likewise each Σ is either an axiom or a Q or an earlier
Σ or derived from two earlier Σ's by modus ponens, and Σσ is R. If we now
construct a new sequence to consist of all the S's in order followed by all
the Σ's in order, then we have a demonstration of P1, ..., Pn,
Q2, ..., Qm |- R. To see this, note that the final step is Σσ, which is R.
Those Σ's (if any) which are Q1 are now accounted for as being repetitions
of Ss. All S's and all other Σ's are accounted for as before.

Theorem IV.3.3.  If P1, ..., Pn |- Q1, R1, ..., Rm |- Q2, and Q1, ...,
Qa |- S, then P1, ..., Pn, R1, ..., Rm, Q3, ..., Qa |- S.

Proof.  From P1, ..., Pn |- Q1 and Q1, ..., Qa |- S we get P1, ..., Pn,
Q2, ..., Qa |- S by Thm.IV.3.2. From R1, ..., Rm |- Q2 and P1, ..., Pn,
Q2, ..., Qa |- S we get P1, ..., Pn, R1, ..., Rm, Q3, ..., Qa |- S by
Thm.IV.3.2.

Theorem IV.3.4.  P, P D Q |- Q.

Proof.  A sequence of S's that will serve as a demonstration of this is
clearly:

S1:    P.

S2:    P D Q.

S3:    Q.

Theorem IV.3.5.  If P1, ..., Pn |- Q and R1, ..., Rm |- Q D S, then
P1, ..., Pn, R1, ..., Rm |- S.

Proof.  In Thm.IV.3.3, take Q1 to be Q and Q2 to be Q D S, and use
Thm.IV.3.4.

We note the particularly useful special case of Thm.IV.3.2:

Theorem IV.3.6.  If |- Q1 and Q1, ..., Qm |- R, then Q2, ..., Qm |- R.

By applying this successively m times, we deduce:

Theorem IV.3.7.  If |- Q1, |- Q2, ..., |- Qm, and Q1, ..., Qm |- R, then |- R.

This states the obvious, but very useful, principle that, if Q1, ..., Qm |- R,
and if each of Q1, ..., Qm can be derived by the axiomatic method, then R
can also be derived by the axiomatic method.

Theorem IV.3.8.  If S1, S2, ..., Ss is a demonstration of P1, P2, ...,
Pn |- Q, then for 1 ≤ i ≤ s, S1, S2, ..., Si is a demonstration of P1, P2, ...,
Pn |- Si.

Theorem IV.3.9.  If R1, ..., Rn is any permutation of P1, ..., Pn and
if P1, ..., Pn |- Q, then R1, ..., Rn |- Q.

Theorem IV.3.10.  If P1, P2, ..., Pn |- Q, then P1, P1, ..., P1, P2, ...,
Pn |- Q, and vice versa.

EXERCISES 

IV.3.1.  Prove that, if P |- Q and Q |- R, then P |- R.

IV.3.2.  Prove that, if |- P D Q, then P |- Q.

4.  Preliminary Theorems.  The reader has become familiar with the
use  of  truth-value  tables.  As  the  axiomatic  method  is  quite  different,  the 
reader  must  make  a  deliberate  effort  not  to  carry  over  to  the  axiomatic 
method  any  habits  which  he  has  acquired  while  using  the  truth-value 
tables.  Thus  by  truth-value  tables  one  easily  shows  the  commutativity, 
associativity, and distributivity of & and v (see Ex. II.3.6, II.3.7, II.3.8,
and II.3.9). There is accordingly a temptation to make immediate use of
these properties in the axiomatic method. Thus, from Axiom scheme 2,
PQ D P, one is tempted to infer QP D P immediately by commutativity
of &. However, one does not have commutativity of & at first with the
axiomatic method, nor is it easy to deduce. Not until Thm.IV.4.13 do
we deduce the commutativity of &, and then only in a limited form which
does not permit indiscriminate replacement of PQ by QP. Thus QP D P
is not available until Thm.IV.4.18, which gives a generalized form of both
PQ D P and QP D P.

After  we  have  proved  the  truth-value  theorem  that  |-  P  if  P  always  takes 
the  value  T,  we  can  proceed  to  do  anything  that  we  could  do  with  truth- 
value  tables.    Until  then,  we  have  to  proceed  very  cautiously. 

Theorem IV.4.1.  P D Q, Q D R |- ~(~RP), where P, Q, and R are
statements.

Proof.  Let P, Q, and R be statements. Define S1, ..., S5 as follows:

S1:    P D Q.

S2:    ~(Q~R).

S3:    P D Q. D .~(Q~R) D ~(~RP).

S4:    ~(Q~R) D ~(~RP).

S5:    ~(~RP).

This sequence of S's constitutes a demonstration, as we see by noting the
following facts. S5 is ~(~RP). S1 and S2 are P D Q and Q D R. S3 is
an instance of Axiom scheme 3. Also S3 has the form S1 D S4, so that S4
is derived by use of modus ponens from S1 as minor premise and S3 as
major premise. S4 has the form S2 D S5, so that S5 is derived by use of
modus ponens from S2 as minor premise and S4 as major premise.

If we already had the commutativity of &, we could interchange ~R
and P in the conclusion of this theorem, and write P D Q, Q D R |- P D R.
However, we do not yet have commutativity, and so this result must wait
awhile.

The  fact  that  Thm.IV.4.1  is  true  no  matter  what  statements  are  taken 
for  P,  Q,  and  R  is  due  to  the  fact  that  any  statements  can  be  taken  for 
P,  Q,  and  R  in  the  axiom  schemes  and  modus  ponens.  For  the  same  reason, 
a  similar  freedom  in  the  choice  of  P,  Q,  R,  etc.,  holds  for  all  theorems.  This 
will  be  taken  for  granted  hereafter,  and  not  mentioned  explicitly. 

Theorem IV.4.2.  |- ~(~PP).

Proof.  Define S1, ..., S5 as follows:

S1:    P D PP.

S2:    ~(PP~P).

S3:    P D PP. D .~(PP~P) D ~(~PP).

S4:    ~(PP~P) D ~(~PP).

S5:    ~(~PP).

This sequence of S's constitutes a demonstration. S5 is ~(~PP). S1 is
an instance of Axiom scheme 1. S2 is an instance of Axiom scheme 2 with
P in place of Q. S3 is an instance of Axiom scheme 3 with PP in place of
Q and ~P in place of R. S3 has the form S1 D S4, so that S4 follows by
modus ponens from S1 and S3. Similarly, S5 follows by modus ponens
from S2 and S4.

We shall not write out in full any more demonstrations, but we shall
always give explicit instructions so that anyone who desires can write out
any or all of the demonstrations. For instance, instead of writing out the
demonstration for Thm.IV.4.2, we could have given the following
instructions:

Proof.  Put PP for Q and P for R in Thm.IV.4.1. Then P D Q becomes
an instance of Axiom scheme 1 and Q D R becomes an instance of Axiom
scheme 2. Then by Thm.IV.3.7, we get |- ~(~PP).

Theorem IV.4.3.

I.   |- ~~P D P.

II.  |- ~PvP.

Proof.  Put ~P for P in Thm.IV.4.2. This gives |- ~(~~P~P),
which is the unabbreviated form of both |- ~~P D P and |- ~PvP.

Let us make one point clear about this matter of substituting ~P for P
in the proof of Thm.IV.4.3. This does not mean that one gets a demonstration of
|- ~~P D P by writing down the five steps of the demonstration of
|- ~(~PP) and adding ~(~~P~P) as a sixth step with the explanation
that it comes from the fifth step by replacing P by ~P. No such
procedure for getting a step from a preceding step is permitted by our definition
of |-. What one must do to get a demonstration of |- ~~P D P is to
replace P by ~P in every step of the demonstration of |- ~(~PP). Then
all steps that were axioms will again be axioms, all steps that were
repetitions of previous steps will again be repetitions of previous steps, and all
steps that were derivable from two earlier steps by modus ponens will
again be derivable from two earlier steps by modus ponens.
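The effect of replacing P by ~P in every step can be seen in a short Python sketch. It is an illustration under assumptions: the tuple encoding of statements and the helper name substitute are hypothetical, and the five steps are those of the demonstration of |- ~(~PP) in Thm.IV.4.2.

```python
# Statements are nested tuples: ("not", A) for ~A, ("and", A, B) for AB,
# and plain strings for statement letters. (Assumed encoding.)

def neg(a):
    return ("not", a)

def conj(a, b):
    return ("and", a, b)

def implies(a, b):
    # A D B abbreviates ~(A~B)
    return neg(conj(a, neg(b)))

def substitute(f, letter, replacement):
    """Replace every occurrence of a statement letter in f by another statement."""
    if isinstance(f, str):
        return replacement if f == letter else f
    return (f[0],) + tuple(substitute(x, letter, replacement) for x in f[1:])

P = "P"
# The five steps of the demonstration of |- ~(~PP).
S5 = neg(conj(neg(P), P))            # ~(~PP)
S1 = implies(P, conj(P, P))          # P D PP
S2 = implies(conj(P, P), P)          # ~(PP~P), that is, PP D P
S4 = implies(S2, S5)                 # ~(PP~P) D ~(~PP)
S3 = implies(S1, S4)                 # instance of Axiom scheme 3
demonstration = [S1, S2, S3, S4, S5]

# Replace P by ~P in every step, not merely in the last one.
T1, T2, T3, T4, T5 = [substitute(s, "P", neg("P")) for s in demonstration]

# The modus ponens relations between the steps survive the replacement.
print(T3 == implies(T1, T4))
print(T4 == implies(T2, T5))
# The new last step is ~(~~P~P), the unabbreviated form of ~~P D P.
print(T5 == implies(neg(neg(P)), P))
```

The replacement acts on the form of each step uniformly, which is why a step of the form S1 D S4 remains of the form T1 D T4 afterwards.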

A similar analysis will show that, if one has assumptions to the left of
the yields sign, as in Thm.IV.4.1, one can still replace each P, Q, R, etc.,
by any other statement, provided that one does so for every occurrence on
both sides of the yields sign. Thus, from Thm.IV.4.1, we can infer ~P D Q,
Q D ~R |- ~~R D P, but we cannot infer P D Q, Q D R |- ~~R D P.

Theorem IV.4.4.  |- ~(QR) D (R D ~Q).

Proof.  Put ~~Q for P in Axiom scheme 3. Take the result as major
premise and Thm.IV.4.3 as minor premise, and use modus ponens.

Theorem IV.4.5.  |- R D ~~R.

Proof.  Put ~R for Q in Thm.IV.4.4 for major premise, and put R for P
in Thm.IV.4.2 for minor premise (and use modus ponens, naturally).

We can rewrite Thm.IV.4.5 as |- P D ~~P. Together with Thm.IV.4.3,
this should give |- P ≡ ~~P. However, we shall not be able to infer
|- P ≡ ~~P from |- P D ~~P and |- ~~P D P until we can prove
P, Q |- PQ, and this will not be proved until late in the section (it follows
from Thm.IV.4.22). Even after we get |- P ≡ ~~P, we shall not be able
to substitute P for ~~P and vice versa until in Chapter VI, after we have
proved the substitution theorem. Nevertheless, by proving Thm.IV.4.3
and Thm.IV.4.5, we have taken the first steps toward being able to regard
P and ~~P as interchangeable.

Theorem IV.4.6.  |- Q D P. D .~P D ~Q.

Proof.  Put ~P for R in Thm.IV.4.4.

Theorem IV.4.7.  ~P D ~Q |- Q D P.

Proof.  Put ~P, ~Q, and Q for P, Q, and R in Axiom scheme 3. Use
this as a major premise and ~P D ~Q as a minor premise and infer
~(~QQ) D ~(Q~P). Now use this as a major premise and put Q for P
in Thm.IV.4.2 for a minor premise.

Notice that at present we cannot infer |- ~P D ~Q. D .Q D P from
Thm.IV.4.7. Later, in Sec. 6, we shall prove the deduction theorem, that
if P |- Q then |- P D Q, but until this has been proved, we cannot infer
|- ~P D ~Q. D .Q D P from Thm.IV.4.7.

Theorem IV.4.8.  P D Q |- RP D QR.

Proof.  Axiom scheme 3 gives P D Q |- ~(QR) D ~(RP). Putting QR
and RP for P and Q in Thm.IV.4.7 gives ~(QR) D ~(RP) |- RP D QR.
Then by Thm.IV.3.2 (with n = 1, m = 1), we infer P D Q |- RP D QR.

Theorem IV.4.9.  P D Q, R D S |- ~(~(QS)(PR)).

Proof.  From R D S by Thm.IV.4.8, we get

(1)  PR D SP.

From P D Q by Thm.IV.4.8, we get

(2)  SP D QS.

From (1) and (2) by Thm.IV.4.1, we get ~(~(QS)(PR)).

Theorem IV.4.10.  P D Q, Q D R, R D S |- P D S.

Proof.  From Q D R we get

(1)  ~~(Q D R)

by Thm.IV.4.5. From R D S, we get ~S D ~R by Thm.IV.4.6. From
P D Q and ~S D ~R, we get ~(~(Q~R)(P~S)) by Thm.IV.4.9. This
is the same as ~(Q D R.P~S). From this, we get P~S D ~(Q D R) by
Thm.IV.4.4. From this, we get ~~(Q D R) D ~(P~S) by Thm.IV.4.6.
Using this as a major premise, and (1) as a minor premise, we get ~(P~S),
which is P D S.

Thm.IV.4.10 is a generalized form of P D Q, Q D R |- P D R. It is
interesting that we can prove the generalized form before we prove the
simpler form.

Theorem IV.4.11.  |- R~~P D PR.

Proof.  Put ~~P and P for P and Q in Thm.IV.4.8 and use Thm.IV.4.3.

Theorem IV.4.12.  |- P D P.

Proof.  By Axiom scheme 1,

(1)  |- ~~P D ~~P~~P.

By Thm.IV.4.11, with ~~P for R,

(2)  |- ~~P~~P D P~~P.

By Thm.IV.4.11, with P for R,

(3)  |- P~~P D PP.

By (1), (2), (3), and Thm.IV.4.10,

(4)  |- ~~P D PP.

By Axiom scheme 2,

(5)  |- PP D P.

By (4), (5), and Thm.IV.4.1, |- ~(~P~~P). This is |- ~P D ~P.
Then by Thm.IV.4.7, |- P D P.

Theorem IV.4.13.  |- RP D PR.

Proof.  Put P for Q in Thm.IV.4.8 and use Thm.IV.4.12.

This theorem gives us a limited commutativity of &.

**Theorem IV.4.14.  P D Q, Q D R |- P D R.

Proof.  Put R for S in Thm.IV.4.10 and use Thm.IV.4.12 with R in
place of P.

Theorem IV.4.15.  |- ~(PR) D ~(RP).

Proof.  Put P for Q in Axiom scheme 3 and use Thm.IV.4.12.

Theorem IV.4.16.  P D Q, R D S |- PR D QS.

Proof.  From P D Q and R D S one gets ~(~(QS)(PR)) by Thm.
IV.4.9. From this by Thm.IV.4.15, one gets ~((PR)~(QS)), which is
PR D QS.

Corollary 1.  P D Q |- PR D QR.

Proof.  Take S to be R and use Thm.IV.4.12 with R in place of P.

Corollary 2.  R D S |- PR D PS.

*Theorem IV.4.17.  P D Q, P D R |- P D QR.

Proof.  Start with P D Q and P D R. By Thm.IV.4.16, PP D QR.
So by Axiom scheme 1 and Thm.IV.4.14, P D QR.

*Theorem IV.4.18.  |- P1P2...Pn D Pm, where 1 ≤ m ≤ n.

Note that we have not yet proved the associativity of &, so that we have
to associate to the left in P1P2...Pn and understand it to mean

(...((P1P2)P3)...P(n-1))Pn.

Proof.  First let m < n. By Axiom scheme 2,

|- P1P2...Pn D P1P2...P(n-1),

|- P1P2...P(n-1) D P1P2...P(n-2),

...

|- P1P2...PmP(m+1) D P1P2...Pm.

So by repeated uses of Thm.IV.4.14, we infer |- P1P2...Pn D
P1P2...Pm. If m = 1, we are done. If m > 1, then by Thm.IV.4.13,

|- P1P2...Pm D Pm(P1P2...P(m-1)).

Also by Axiom scheme 2,

|- Pm(P1P2...P(m-1)) D Pm.

So by further uses of Thm.IV.4.14, the desired result follows. If m = n,
we proceed as in the last two displayed formulas.

Theorem IV.4.19.  |- (PQ)R D P(QR).

Note that the rule of association to the left permits us to write this as
|- PQR D P(QR).

Proof.  By Thm.IV.4.18,

(1)  |- PQR D P,

(2)  |- PQR D Q,

(3)  |- PQR D R.

By (2) and (3) and Thm.IV.4.17,

(4)  |- PQR D QR.

Then by (1) and (4) and Thm.IV.4.17, |- PQR D P(QR).

Theorem IV.4.20.  |- P(QR) D (PQ)R.

Proof.  By Thm.IV.4.13,

|- P(QR) D (QR)P.

By Thm.IV.4.19,

|- (QR)P D Q(RP).

By Thm.IV.4.13,

|- Q(RP) D (RP)Q.

By Thm.IV.4.19,

|- (RP)Q D R(PQ).

By Thm.IV.4.13,

|- R(PQ) D (PQ)R.

By repeated uses of Thm.IV.4.14, the desired result follows.

We shall not be able to infer |- (PQ)R ≡ P(QR) from these last two
lemmas until we have proved |- P D (Q D PQ), which will be Thm.IV.4.22.
Theorem IV.4.21.  |- PQ D R. D .P D (Q D R).

Proof.  By Thm.IV.4.20,

(1)  |- P(Q~R) D (PQ)~R.

By Thm.IV.4.3, and Thm.IV.4.16, Cor. 2,

|- P~~(Q~R) D P(Q~R).

This is

(2)  |- P~(Q D R) D P(Q~R).

By (1) and (2) and Thm.IV.4.14,

|- P~(Q D R) D (PQ)~R.

By Thm.IV.4.6,

|- ~((PQ)~R) D ~(P~(Q D R)),

which is the desired result.

**Theorem IV.4.22.  |- P D (Q D PQ).

Proof.  By Thm.IV.4.12,

|- PQ D PQ.

Now use Thm.IV.4.21 (with R replaced by PQ) as a major premise.

By means of this theorem we can deduce |- ~~P ≡ P from Thms.IV.4.3
and IV.4.5, and |- (PQ)R ≡ P(QR) from Thms.IV.4.19 and IV.4.20.
However, we shall not be able to substitute P for ~~P or P(QR) for (PQ)R
until in Chapter VI, after we have proved the substitution theorem.

Theorem IV.4.23.  P D Q, ~P D Q |- Q.

Proof.  Start with P D Q and ~P D Q. By Thm.IV.4.6, we get
~Q D ~P and ~Q D ~~P. Then by Thm.IV.4.17, ~Q D ~P~~P.
Then by Thm.IV.4.6, ~(~P~~P) D ~~Q. That is, ~P D ~P. D
~~Q. Putting ~P for P in Thm.IV.4.12 gives us ~~Q, and then by
Thm.IV.4.3, we get Q.

Theorem IV.4.24.  PQ D R, P~Q D R |- P D R.

Proof.  Start with PQ D R and P~Q D R. By Thm.IV.4.13 and
Thm.IV.4.14, we get QP D R and ~QP D R. Then by Thm.IV.4.21, we
get Q D (P D R) and ~Q D (P D R). So by Thm.IV.4.23, we get P D R.

Theorem IV.4.25.  P D Q |- P D ~~Q.

Proof.  Replace R by ~~Q in Thm.IV.4.14 and use Thm.IV.4.5.

Theorem IV.4.26.  P D ~Q |- P D ~(QR).

Proof.  By Axiom scheme 2, |- QR D Q. So by Thm.IV.4.6, |- ~Q D
~(QR). However, by Thm.IV.4.14, P D ~Q, ~Q D ~(QR) |- P D
~(QR).

Corollary.  P D ~Q |- P D (Q D R).

Proof.  Put ~R in place of R.

Theorem IV.4.27.  P D ~R |- P D ~(QR).

Proof.  Start with P D ~R. By Thm.IV.4.26, we get P D ~(RQ).
By this and Thm.IV.4.15, we get P D ~(QR) by use of Thm.IV.4.14.

Theorem IV.4.28.  P D R |- P D (Q D R).

Proof.  Start with P D R. By Thm.IV.4.25, P D ~~R. So by
Thm.IV.4.27, P D ~(Q~R). That is, P D (Q D R).

*Corollary.  |- P D (Q D P).

Proof.  Replace R by P and use Thm.IV.4.12.

Theorem IV.4.29.  P D Q, P D ~R |- P D ~(Q D R).

Proof.  Start with P D Q and P D ~R. Then by Thm.IV.4.17,
P D Q~R. So by Thm.IV.4.25, P D ~~(Q~R). That is,
P D ~(Q D R).

EXERCISES 

IV.4.1.     Write  out  a  complete  demonstration  for  Thm.IV.4.5. 

IV.4.2.     Write  out  a  complete  demonstration  for  Thm.IV.4.7. 

IV.4.3.     Write  out  a  complete  demonstration  for  Thm.IV.4.8. 

IV.4.4.     Write  out  a  complete  demonstration  for  Thm.IV.4.9. 

IV.4.5.  State an upper bound for the minimum number of steps needed
for a complete demonstration of |- P D P and justify your statement.

IV.4.6.  Using only results of Sec. 4 or earlier portions of the present
exercise, prove:

(a)  |- PvQ D QvP.

(b)  |- P D PvQ.

(c)  |- Pm D P1vP2v...vPn, if 1 ≤ m ≤ n.

(d)  P D R, Q D R |- PvQ D R.

(Hint.  Proceed as in the beginning of the proof of Thm.IV.4.23.)

(e)  |- Pv(QvR) D (PvQ)vR.

(f)  |- (PvQ)vR D Pv(QvR).

(g)  |- P D (Q D R). D .PQ D R.

(h)  |- (PvQ)R D PRvQR.

(Hint.  Prove each of |- P D .R D PRvQR and |- Q D .R D PRvQR and
then use parts (d) and (g).)

(i)  |- PRvQR D (PvQ)R.

(j)  |- PQvR D (PvR)(QvR).

(k)  |- (PvR)(QvR) D PQvR.

(Hint.  By (h), |- (PvR)(QvR) D P(QvR)vR(QvR).  Now prove
|- P(QvR) D PQvR and |- R(QvR) D PQvR, and use (d).)

(l)  |- PQvP~Qv~PQv~P~Q.

IV.4.7.  Write out a complete demonstration for |- ~P D (P D Q).

IV.4.8.  State why one cannot prove Thm.IV.4.20 in the same way that
Thm.IV.4.19 was proved.

IV.4.9.  Prove P, Q |- PQ.

5.  The Truth-value Theorem.  In this section we prove the truth-value
theorem to the effect that, if P always takes the value T, then |- P.
We shall illustrate the method by actually proving

|- P D Q. D :.P D .Q D R: D .P D R.

Let us first show that this always takes the value T, which we do by means
of the following abridged truth-value table in which U denotes P D
(Q D R), V denotes U D (P D R) (that is, V denotes P D .Q D R: D .
P D R), and W denotes (P D Q) D V (that is, W denotes P D Q. D :.
P D .Q D R: D .P D R).


P   Q   R   P D Q   Q D R   P D (Q D R)   P D R   U D (P D R)   (P D Q) D V
                                 U                     V             W

        T                                   T          T             T
F       F                                   T          T             T
T   F   F     F                                                      T
T   T   F             F          F                     T             T

The interpretation of the first line of this is that, if R is true, then each
of P D R, U D (P D R), and (P D Q) D V is true. The corresponding
statements of symbolic logic would be:

(1)  |- R D (P D R),

(2)  |- R D .U D (P D R),        i.e.  |- R D V,

(3)  |- R D .(P D Q) D V,        i.e.  |- R D W.


Let  us  try  to  prove  these.  This  turns  out  to  be  very  easy,  inasmuch  as  we 
have  derived  theorems  which  are  expressly  designed  to  prove  these.  (1) 
follows  by  the  corollary  to  Thm.IV.4.28,  and  (2)  follows  from  (1)  by 
Thm.IV.4.28,  and  (3)  follows  from  (2)  by  Thm.IV.4.28. 

Statements corresponding to the second line of our truth-value table
would be:

(4)  |- ~R~P D (P D R),

(5)  |- ~R~P D .U D (P D R),

(6)  |- ~R~P D .(P D Q) D V.

To derive (4), we first get |- ~R~P D ~P by Thm.IV.4.18, and from this
we get (4) by the corollary to Thm.IV.4.26. Then we get (5) from (4) and
(6) from (5) by Thm.IV.4.28.

Statements corresponding to the third line of our truth-value table
would be:

(7)  |- ~RP~Q D ~(P D Q),

(8)  |- ~RP~Q D .(P D Q) D V.

To prove (7), we get |- ~RP~Q D P and |- ~RP~Q D ~Q by Thm.
IV.4.18, and then (7) follows by Thm.IV.4.29. From (7), we get (8) by
the corollary to Thm.IV.4.26.

Statements corresponding to the fourth line of our truth-value table
would be:

(9)   |- ~RPQ D ~(Q D R),

(10)  |- ~RPQ D ~(P D (Q D R)),

(11)  |- ~RPQ D .U D (P D R),

(12)  |- ~RPQ D .(P D Q) D V.

To prove (9), we get |- ~RPQ D Q and |- ~RPQ D ~R by Thm.IV.4.18,
and then (9) follows by Thm.IV.4.29. Now |- ~RPQ D P follows by
Thm.IV.4.18, and from this and (9), we get (10) by Thm.IV.4.29. Then
(11) follows from (10) by the corollary to Thm.IV.4.26, and finally (12)
follows from (11) by Thm.IV.4.28.

By  paralleling  formally  the  reasoning  embodied  in  each  of  the  four  lines 
of  our  truth-value  table,  we  have  proved 

(3) ⊢ R ⊃ W,

(6) ⊢ ~R~P ⊃ W,

(8) ⊢ ~RP~Q ⊃ W,

(12) ⊢ ~RPQ ⊃ W.

Now by (8) and (12), we infer

(13) ⊢ ~RP ⊃ W

by Thm.IV.4.24. By (6) and (13), we infer

(14) ⊢ ~R ⊃ W

by Thm.IV.4.24. Finally, by (3) and (14), we infer ⊢ W by Thm.IV.4.23.
So we have proved:

Theorem IV.5.1. ⊢ P ⊃ Q. ⊃ :.P ⊃ .Q ⊃ R: ⊃ .P ⊃ R.

Moreover,  we  based  our  proof  on  a  truth-value  table,  making  it  plausible 
that  we  can  base  the  proofs  of  other  statements  on  their  truth-value  tables. 
We  now  wish  to  show  rigorously  that  we  can  indeed  do  so. 
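
The truth-value computation on which this proof was based can itself be carried out mechanically. The following Python sketch is an illustration only and is no part of the formal development. It assumes the usual truth function for ⊃ (false only when the antecedent is T and the consequent is F), and the names implies, W, abridged_case, and rows are our own. It evaluates W for all eight assignments of the unabridged table and records which of the four rows of the abridged table covers each assignment. Such a computation shows only that W always takes the value T; it is not by itself a demonstration of ⊢ W.

```python
from itertools import product

def implies(a, b):
    """Truth function assumed for the horseshoe: false only when a is T and b is F."""
    return (not a) or b

def W(p, q, r):
    """Value of W, that is, of (P ⊃ Q) ⊃ V."""
    u = implies(p, implies(q, r))        # U is P ⊃ (Q ⊃ R)
    v = implies(u, implies(p, r))        # V is U ⊃ (P ⊃ R)
    return implies(implies(p, q), v)     # W is (P ⊃ Q) ⊃ V

def abridged_case(p, q, r):
    """Name the row of the abridged table that covers a given assignment."""
    if r:
        return "R"
    if not p:
        return "~R~P"
    if not q:
        return "~RP~Q"
    return "~RPQ"

# One entry per row of the unabridged table: (P, Q, R, covering case, value of W).
rows = [(p, q, r, abridged_case(p, q, r), W(p, q, r))
        for p, q, r in product([True, False], repeat=3)]

for row in rows:
    print(row)
```

The four case names are exactly the antecedents of (3), (6), (8), and (12), so every one of the eight assignments falls under one of the four statements proved above.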

For  actually  carrying  out  a  proof,  the  abridged  truth-value  table  is  much 
shorter,  and  hence  more  convenient.  However,  for  proving  that  we 
always  can  carry  out  a  proof,  the  unabridged  truth-value  table  is  more 
systematic,  and  hence  preferable. 

We  note  that  the  proof  fell  into  two  distinct  phases.  In  the  first  phase, 
we  were  proving  statements  such  as  (1)  through  (12)  corresponding  to 
rows  of  our  truth-value  table.  In  the  second  phase,  we  combined  these 
statements  by  means  of  Thm.IV.4.24  and  Thm.IV.4.23.  Let  us  now 
examine  the  first  phase  more  carefully. 

A logical product such as ~RPQ corresponds to a choice of a set of
truth values for P, Q, and R; in this case the choice P is T, Q is T, and
R is F. Clearly any such logical product corresponds to a choice of a set
of truth values, and vice versa.

Given this choice of truth values, some formulas, such as U, take the
value F and other formulas, such as V, take the value T. For a formula U
which takes the value F, we wish to prove

(10) ⊢ ~RPQ ⊃ ~U,

and for a formula V which takes the value T, we wish to prove

(11) ⊢ ~RPQ ⊃ V.

Incidentally, the departure from alphabetical order in the product ~RPQ
was adopted to fit the peculiar arrangement of cases in the abridged
truth-value table. For a complete systematic listing of cases, such as
occurs in the unabridged truth-value table, the usual alphabetic order
would be quite suitable, and we would wish to prove

⊢ PQ~R ⊃ ~U

and

⊢ PQ~R ⊃ V.

These  are  special  cases  of  the  general  result  proved  in  the  theorem  below. 

Theorem IV.5.2. Let P₁, P₂, ..., Pₙ be statements. Let X be a
statement built up from some or all of the P's by use of & and ~, using
each P more than once if desired. Let Q₁, Q₂, ..., Qₙ be some statements
satisfying the condition that for some (or no) i's Qᵢ is Pᵢ, and for the
remaining i's (if any) Qᵢ is ~Pᵢ. Let the value T be assigned to Pᵢ for those
i's (if any) for which Qᵢ is Pᵢ, and let the value F be assigned to Pᵢ for the
remaining i's (if any). Let the corresponding value for X be computed by
the method of truth-value tables. If the value for X is T, then

⊢ Q₁Q₂ ⋯ Qₙ ⊃ X,

and if the value for X is F, then

⊢ Q₁Q₂ ⋯ Qₙ ⊃ ~X.

Proof. Proof by induction on the number of symbols in X, counting
each occurrence of ~ or a P as a symbol. In other words, we proceed (as
in the construction of truth-value tables) from simple statements to more
complex statements. First, suppose there is a single symbol in X. Then
X must be some Pᵢ. By Thm.IV.4.18,

(a) ⊢ Q₁Q₂ ⋯ Qₙ ⊃ Qᵢ.

Case 1. Qᵢ is Pᵢ. Then (a) is

⊢ Q₁Q₂ ⋯ Qₙ ⊃ X.

However, in this case the value T is assigned to Pᵢ and hence to X.

Case 2. Qᵢ is ~Pᵢ. Then (a) is

⊢ Q₁Q₂ ⋯ Qₙ ⊃ ~X.

However, in this case the value F is assigned to Pᵢ and hence to X.

Now assume the theorem true for k or fewer symbols in X with k a
positive integer. We call this assumption the hypothesis of the induction.
Let X have k + 1 symbols. Hence X has at least two symbols. It was
assumed that X was built up by use of & and ~. Hence the last step in
building X was either to use an & or to use a ~.

Case 1. The last step was to use an &. So X is AB, where each of A
and B has k or fewer symbols.

Subcase I. A has the value T. Then by the hypothesis of the induction

(b) ⊢ Q₁Q₂ ⋯ Qₙ ⊃ A.

Subsubcase i. B has the value T, giving X the value T. Then by the
hypothesis of the induction,

(c) ⊢ Q₁Q₂ ⋯ Qₙ ⊃ B.

Then by (b) and (c) and Thm.IV.4.17,

⊢ Q₁Q₂ ⋯ Qₙ ⊃ AB.

This is

⊢ Q₁Q₂ ⋯ Qₙ ⊃ X.

However, this is just what we wish, since in this case X takes the value T.

Subsubcase ii. B has the value F, giving X the value F. Then by the
hypothesis of the induction,

⊢ Q₁Q₂ ⋯ Qₙ ⊃ ~B.

So by Thm.IV.4.27

⊢ Q₁Q₂ ⋯ Qₙ ⊃ ~(AB).

This is

⊢ Q₁Q₂ ⋯ Qₙ ⊃ ~X.

However, this is just what we wish, since X takes the value F in this case.

Subcase II. A has the value F, giving X the value F. Then by the
hypothesis of the induction,

⊢ Q₁Q₂ ⋯ Qₙ ⊃ ~A.

So by Thm.IV.4.26

⊢ Q₁Q₂ ⋯ Qₙ ⊃ ~(AB).

This is

⊢ Q₁Q₂ ⋯ Qₙ ⊃ ~X,

which is just what we wish since X takes the value F in this case.

Case 2. The last step was to use a ~. So X is ~C, and C has k symbols.

Subcase I. C has the value T, so that X has the value F. By the
hypothesis of the induction,

⊢ Q₁Q₂ ⋯ Qₙ ⊃ C.

So by Thm.IV.4.25,

⊢ Q₁Q₂ ⋯ Qₙ ⊃ ~~C.

This is

⊢ Q₁Q₂ ⋯ Qₙ ⊃ ~X,

which is just what we wish.

Subcase II. C has the value F, so that X has the value T. By the
hypothesis of the induction,

⊢ Q₁Q₂ ⋯ Qₙ ⊃ ~C.

This is

⊢ Q₁Q₂ ⋯ Qₙ ⊃ X,

which is just what we wish.

This theorem takes care of the first phase of any proof based on a
truth-value table. If each Qᵢ is either Pᵢ or ~Pᵢ, then Q₁Q₂ ⋯ Qₙ represents a
choice of a set of truth values for the P's, as indicated in the theorem. The
results

⊢ Q₁Q₂ ⋯ Qₙ ⊃ X

or

⊢ Q₁Q₂ ⋯ Qₙ ⊃ ~X

for various X's correspond to the entries in one row of a truth-value table.

Notice that in the theorem above we assumed X to be built up out of &
and ~ alone. That is, each part of X of the form P∨Q, P ⊃ Q, or P ≡ Q
is written in unabbreviated form before one starts to construct the
truth-value table. In constructing an actual truth-value table, this would
considerably increase the labor of construction, but for a proof about
truth-value tables it allows considerable simplification.

The  proof  of  Thm.IV.5.2  is  a  particularly  interesting  example  of  proof  by 
cases  (see  Principle  5  in  Sec.  5  of  Chapter  II)  in  that  many  of  the  cases  are 
themselves  proofs  by  cases. 

We claimed earlier that all our proofs should be constructive. Let us
inquire if the proof above satisfies this requirement. We claim to prove
either

⊢ Q₁Q₂ ⋯ Qₙ ⊃ X

or

⊢ Q₁Q₂ ⋯ Qₙ ⊃ ~X

for each X. That is, we claim the existence of a sequence of steps S₁, S₂,
..., Sₛ which is a demonstration of one of the results stated. Have we given
instructions for constructing such a sequence of steps? We believe that
we have.

To verify this, let us refer back to the proof. We first give instructions
for handling each X consisting of only one symbol. In fact Thm.IV.4.18
takes care of this case. The remainder of the theorem assumes that we
have available instructions for constructing demonstrations for each X of
k or fewer symbols, and furnishes instructions for constructing
demonstrations for each X of k + 1 symbols. Now given Thm.IV.4.18 which gives
instructions for all X's of one symbol, we take k = 1 and then have
instructions for all X's of two symbols. Now we can take k = 2 and get
instructions for all X's of three symbols. Proceeding in this way, we can build up
a set of instructions for X's of any desired degree of complexity.

In  any  particular  case,  the  process  actually  goes  through  quite  easily,  as 
witness  the  first  phase  of  our  proof  of  Thm.IV.5.1. 
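
The instructions contained in the proof of Thm.IV.5.2 can be sketched as a recursive routine. The Python below is an illustration under stated assumptions and not a formal part of the text: a statement is encoded as a string (one of the P's), as ("not", C), or as ("and", A, B), and the names show and derive are our own. For a given assignment, derive returns the truth value of X together with the list of theorems which the proof cites, in order, each paired with the statement thereby shown to follow from Q₁Q₂ ⋯ Qₙ. The example assumes that P ⊃ Q, written in unabbreviated form, is ~(P~Q).

```python
def show(x):
    """Write an encoded statement as a string, using & and ~ only."""
    if isinstance(x, str):
        return x
    if x[0] == "not":
        return "~" + show(x[1])
    return "(" + show(x[1]) + " & " + show(x[2]) + ")"

def derive(x, assignment):
    """Return (value, steps) following the cases of the proof of Thm.IV.5.2.

    Each step is (theorem cited, consequent), meaning that the cited theorem
    yields the result that Q1 Q2 ... Qn implies the consequent.
    """
    if isinstance(x, str):                        # a single symbol: X is some P
        value = assignment[x]
        return value, [("Thm.IV.4.18", x if value else "~" + x)]
    if x[0] == "and":                             # Case 1: X is AB
        a_val, a_steps = derive(x[1], assignment)
        if not a_val:                             # Subcase II: A has the value F
            return False, a_steps + [("Thm.IV.4.26", "~" + show(x))]
        b_val, b_steps = derive(x[2], assignment)
        if b_val:                                 # Subsubcase i: both have the value T
            return True, a_steps + b_steps + [("Thm.IV.4.17", show(x))]
        return False, b_steps + [("Thm.IV.4.27", "~" + show(x))]   # Subsubcase ii
    c_val, c_steps = derive(x[1], assignment)     # Case 2: X is ~C
    if c_val:                                     # Subcase I: get ~~C, which is ~X
        return False, c_steps + [("Thm.IV.4.25", "~" + show(x))]
    return True, c_steps                          # Subcase II: ~C is already X

# P ⊃ Q in unabbreviated form (assumed to be ~(P~Q)), with P true and Q false.
p_implies_q = ("not", ("and", "P", ("not", "Q")))
value, steps = derive(p_implies_q, {"P": True, "Q": False})
for step in steps:
    print(step)
```

With P true and Q false the routine reports the value F and ends with a citation of Thm.IV.4.25, which parallels statement (7) in the first phase of the proof of Thm.IV.5.1, where the abbreviated form allowed the shorter appeal to Thm.IV.4.29.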

We  now  prove  the  truth-value  theorem: 

**Theorem IV.5.3. Let P₁, P₂, ..., Pₙ be statements. Let X be a
statement built up from P₁, P₂, ..., Pₙ by use of & and ~, using each P
more than once if desired. Let X take the value T whatever sets of values
T and F be assigned to P₁, P₂, ..., Pₙ. Then ⊢ X.

The first phase of the proof was taken care of by Thm.IV.5.2, and we now
need only carry out the second phase in the manner already indicated in
the proof of Thm.IV.5.1. By Thm.IV.5.2, we have

⊢ Q₁Q₂ ⋯ Qₙ ⊃ X

for each set of Q's such that each Qᵢ is either Pᵢ or ~Pᵢ. In particular,
taking Qₙ to be first Pₙ and then ~Pₙ, we get both

⊢ Q₁Q₂ ⋯ Qₙ₋₁Pₙ ⊃ X

and

⊢ Q₁Q₂ ⋯ Qₙ₋₁~Pₙ ⊃ X.

So by Thm.IV.4.24,

⊢ Q₁Q₂ ⋯ Qₙ₋₁ ⊃ X.

We now repeat this reasoning, letting Qₙ₋₁ be first Pₙ₋₁ and then ~Pₙ₋₁,
and using Thm.IV.4.24 again to infer

⊢ Q₁Q₂ ⋯ Qₙ₋₂ ⊃ X.

We continue in this way down to

⊢ Q₁ ⊃ X.

Letting Q₁ be first P₁ and then ~P₁, we get

⊢ P₁ ⊃ X        and        ⊢ ~P₁ ⊃ X.

Finally, we use Thm.IV.4.23.
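
The bookkeeping of this second phase can be laid out explicitly. The following Python sketch is our own illustration, with the hypothetical name second_phase; it writes statements merely as strings and lists, in order, every inference used to pass from the 2ⁿ results furnished by Thm.IV.5.2 down to ⊢ X. Each entry names the theorem cited, its two premises, and its conclusion.

```python
from itertools import product

def second_phase(names):
    """List the inferences of the second phase for the statements in `names`.

    Each entry is (theorem, premise 1, premise 2, conclusion); every formula
    is written as a product of P's and ~P's followed by '⊃ X'.
    """
    steps = []
    # Eliminate the last factor of the product, for m = n, n-1, ..., 2.
    for m in range(len(names), 1, -1):
        last = names[m - 1]
        for signs in product([True, False], repeat=m - 1):
            prefix = "".join(name if sign else "~" + name
                             for name, sign in zip(names, signs))
            steps.append(("Thm.IV.4.24",
                          f"{prefix}{last} ⊃ X",
                          f"{prefix}~{last} ⊃ X",
                          f"{prefix} ⊃ X"))
    # The final inference removes the one remaining statement.
    first = names[0]
    steps.append(("Thm.IV.4.23", f"{first} ⊃ X", f"~{first} ⊃ X", "X"))
    return steps

for step in second_phase(["P", "Q", "R"]):
    print(step)
```

For three statements this lists seven inferences, six by Thm.IV.4.24 and one by Thm.IV.4.23; in general there are 2ⁿ − 1 of them. The proof of Thm.IV.5.1 needed only three because it started from the abridged table.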

Various  comments  about  this  theorem  are  in  order.  In  the  first  place, 
we  are  now  free  to  make  use  of  any  statement  which  always  takes  the  value 
T.  Moreover,  if  X  always  takes  the  value  T,  one  can  derive  X  from  the 
truth-value axioms alone by use of modus ponens. One could verify this
by  checking  through  the  proofs  of  all  the  theorems  up  to  and  including 
Thm.IV.5.3.  Or  one  can  observe  it  more  easily  by  noting  that,  since  only 
the  truth-value  axioms  have  been  listed  so  far,  no  other  axioms  could  have 
been  used  so  far. 

EXERCISES 

IV.5.1. Using only results from Sec. 4, prove:

(a) ⊢ ~PQR ⊃ ~((P ≡ Q) ≡ R).
(b) ⊢ ~PQ~R ⊃ ((P ≡ Q) ≡ R).



IV.5.2. We say that our symbolic logic is inconsistent if there is a
statement P such that ⊢ P and ⊢ ~P. Prove that, if our symbolic logic is
inconsistent, then ⊢ Q for every statement Q.

6. The Deduction Theorem. In this section we prove the important
theorem that if Q ⊢ R then ⊢ Q ⊃ R. Actually, we prove a generalized
form of it to the effect that, if P₁, P₂, ..., Pₙ, Q ⊢ R, then P₁, P₂, ...,
Pₙ ⊢ Q ⊃ R. This result will be called the deduction theorem.

The reader may wonder how we can prove this theorem before we have
stated our complete set of axioms. If we have Q ⊢ R, this may make use
of axioms which we have not yet stated. In such a case, not knowing all
the axioms used in Q ⊢ R, how can we go about proving ⊢ Q ⊃ R? Actually,
there is no difficulty, since it turns out that the use of axioms in ⊢ Q ⊃ R
exactly parallels that in Q ⊢ R. Hence it suffices to know that we have the
same set of axioms available in each case, and the exact forms of the
axioms are of no consequence, provided only that our axioms are adequate
to prove

⊢ P ⊃ P,

⊢ P ⊃ (Q ⊃ P),

and

⊢ P ⊃ Q. ⊃ :.P ⊃ .Q ⊃ R: ⊃ .P ⊃ R.

As these are Thm.IV.4.12, the corollary to Thm.IV.4.28, and Thm.IV.5.1,
our set of axioms does satisfy the proviso.

We  now  prove  the  deduction  theorem. 

**Theorem IV.6.1. If P₁, P₂, ..., Pₙ, Q ⊢ R, then P₁, P₂, ..., Pₙ ⊢
Q ⊃ R.

Proof. Assume that we have given a demonstration S₁, S₂, ..., Sₛ of
P₁, P₂, ..., Pₙ, Q ⊢ R. Then we wish to show how to construct a
demonstration of P₁, P₂, ..., Pₙ ⊢ Q ⊃ R. By the definition of ⊢, we know that
Sₛ is R and for each Sᵢ either:

(1) Sᵢ is an axiom.

(2) Sᵢ is a P or is Q.

(3) There is a j less than i such that Sᵢ and Sⱼ are the same.

(4) There are j and k, each less than i, such that Sₖ is Sⱼ ⊃ Sᵢ.

We take Q ⊃ S₁, Q ⊃ S₂, ..., Q ⊃ Sₛ to be key steps of our
demonstration of P₁, P₂, ..., Pₙ ⊢ Q ⊃ R. The last of our key steps, Q ⊃ Sₛ, is
Q ⊃ R, as desired. By filling in additional steps before each of our key
steps, we can build up a complete demonstration. We do this as follows.

Case 1. Sᵢ is an axiom. Then before the key step Q ⊃ Sᵢ we insert the
following steps: First a demonstration of ⊢ Sᵢ ⊃ (Q ⊃ Sᵢ) (see the corollary
to Thm.IV.4.28), and then the step Sᵢ. From these, one can proceed to
the key step Q ⊃ Sᵢ by modus ponens.

Case 2. Sᵢ is a P. We proceed as in Case 1.

Case 3. Sᵢ is Q. Then we insert a demonstration of ⊢ Q ⊃ Q (see
Thm.IV.4.12). As Sᵢ is Q, the key step Q ⊃ Sᵢ is a repetition of the last
step of the demonstration of ⊢ Q ⊃ Q.

Case 4. Sᵢ is the same as an earlier Sⱼ. Then the key step Q ⊃ Sᵢ is the
same as an earlier key step Q ⊃ Sⱼ and we need not insert any extra steps
before Q ⊃ Sᵢ.

Case 5. There are an earlier Sⱼ and an earlier Sₖ such that Sₖ is Sⱼ ⊃ Sᵢ.
That is, Sⱼ and Sⱼ ⊃ Sᵢ occur before Sᵢ. Then the key steps Q ⊃ Sⱼ and
Q ⊃ (Sⱼ ⊃ Sᵢ) have already occurred. We insert a demonstration of
⊢ Q ⊃ Sⱼ. ⊃ :Q ⊃ (Sⱼ ⊃ Sᵢ). ⊃ .Q ⊃ Sᵢ (see Thm.IV.5.1). By modus
ponens with the earlier key step Q ⊃ Sⱼ, we can justify the step Q ⊃
(Sⱼ ⊃ Sᵢ). ⊃ .Q ⊃ Sᵢ, which we insert. By modus ponens with the earlier
key step Q ⊃ (Sⱼ ⊃ Sᵢ) we can justify the key step Q ⊃ Sᵢ.
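
The construction in this proof is entirely mechanical, and may be sketched as a short program. The Python below is offered as an illustration under our own assumptions, not as part of the formal development: statements are encoded as strings or as ("imp", A, B), each step of the given demonstration carries a tag saying which of the cases applies to it, and the names imp, deduction, and mp_ok are hypothetical. A line justified by citing a theorem stands for the whole demonstration of that theorem, which the sketch does not write out.

```python
def imp(a, b):
    """Encode the statement a ⊃ b."""
    return ("imp", a, b)

def deduction(steps, q):
    """Turn a demonstration of P's, Q ⊢ R into an outline of one of P's ⊢ Q ⊃ R.

    Each input step is (statement, why), where why is ("axiom",), ("hyp",),
    ("q",), ("repeat", j), or ("mp", j, k) with steps[k] being steps[j] ⊃ statement.
    """
    out = []
    for s, why in steps:
        key = imp(q, s)                                   # the key step Q ⊃ S
        if why[0] in ("axiom", "hyp"):                    # Cases 1 and 2
            out.append((imp(s, key), "corollary to Thm.IV.4.28"))
            out.append((s, "axiom" if why[0] == "axiom" else "hypothesis"))
            out.append((key, "modus ponens"))
        elif why[0] == "q":                               # Case 3
            out.append((key, "Thm.IV.4.12"))
        elif why[0] == "repeat":                          # Case 4
            out.append((key, "same as an earlier key step"))
        else:                                             # Case 5
            sj = steps[why[1]][0]
            a, b = imp(q, sj), imp(q, imp(sj, s))
            out.append((imp(a, imp(b, key)), "Thm.IV.5.1"))
            out.append((imp(b, key), "modus ponens"))
            out.append((key, "modus ponens"))
    return out

def mp_ok(lines):
    """Check that each line marked modus ponens follows from two earlier lines."""
    seen = []
    for f, why in lines:
        if why == "modus ponens" and not any(imp(x, f) in seen for x in seen):
            return False
        seen.append(f)
    return True

# A demonstration of A ⊃ B, A ⊢ B, with A playing the part of Q.
given = [(imp("A", "B"), ("hyp",)), ("A", ("q",)), ("B", ("mp", 1, 0))]
new_demo = deduction(given, "A")
for line in new_demo:
    print(line)
```

The three given steps become seven lines, the last of which is A ⊃ B, as the theorem requires.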

In Sec. 4 of Chapter II, we made frequent mention of the principle that,
if one assumes P and correctly deduces Q, then one can infer P ⊃ Q. The
justification of this is carried out in two steps. The first step is to show that,
in any case where one assumes P and correctly deduces Q, the deduction
can be thrown into a form which is a demonstration of P ⊢ Q. Then the
second step is to infer ⊢ P ⊃ Q by the deduction theorem.

The first step can never be conclusively established, because the notion
of "correct deduction" is not precisely defined. In fact, one of our aims in
setting up a system of symbolic logic is to give a precise definition, and we
intend to take P ⊢ Q as a precise definition of "Q can be correctly deduced
from P." It is our intention in the succeeding chapters to consider a great
many instances in which it is commonly agreed that one has a correct
deduction of Q from P, and to show in each of these cases that P ⊢ Q. We
shall exhibit enough instances that we hope that the weight of the evidence
will impel a belief on the part of the reader that the precise idea P ⊢ Q is
indeed an adequate approximation to the less precise idea "Q can be
correctly deduced from P."

EXERCISES 
IV.6.1. Prove that P ⊢ Q if and only if ⊢ P ⊃ Q.


CHAPTER  V 
CLARIFICATION 

In  Chapter  I  we  stated  our  aims  in  very  general  terms.  We  are  now  in  a 
position  to  be  more  explicit.  We  now  have  shown  the  beginnings  of  a 
system  of  symbolic  logic  and  have  shown  how  this  system  may  be  used  as 
a  model  for  mathematical  reasoning.  Hence  we  believe  it  worth  while  to 
pause  and  restate  our  intentions  more  carefully. 

There  is  an  intuitive  notion  of  logical  correctness  which  is  shared  by  the 
majority  of  mathematicians.  There  will  be  disagreement  on  some  of  the 
more  abstruse  principles,  such  as  the  axiom  of  choice,  but  if  we  confine  our 
attention  to  the  more  basic  principles,  there  is  general  agreement.  It  is 
our  intention  to  give  a  precise  definition  of  these  basic  principles  by  a 
mechanical  system  known  as  symbolic  logic. 

Note  that  we  make  no  claim  to  justify  the  basic  principles;  we  claim 
only  that  we  define  them  with  precision. 

We  might  as  well  be  frank  and  admit  that  some  of  these  basic  logical 
principles  used  by  all  (or  almost  all)  mathematicians  may  actually  not  be 
valid.  There  is  strong  evidence  for  their  validity  in  that  they  are  in 
common use by mathematicians, physicists, chemists, engineers, and
technical men generally, and the results obtained by them not only are quite
satisfactory  but  are  useful  almost  to  the  point  of  being  indispensable. 
However,  this  evidence  is  not  conclusive.  It  is  a  matter  of  history  that 
infinite  series  (including  even  divergent  ones)  were  at  one  time  generally 
dealt  with  according  to  principles  which  are  now  believed  to  be  incorrect. 
Nonetheless,  at  the  time,  the  results  obtained  by  use  of  these  (incorrect) 
principles  were  satisfactory  and  useful. 

Actually,  some  of  the  basic  principles  in  common  use  have  been  subjected 
to  severe  criticism,  notably  certain  uses  of  reductio  ad  absurdum.  We  do 
not  presume  to  make  a  judgment  in  this  matter.  We  include  reductio  ad 
absurdum  in  our  symbolic  logic  (see  Sec.  4  of  Chapter  II),  but  we  do  so 
merely  because  practically  every  mathematician  uses  reductio  ad  absurdum 
freely,  and  not  at  all  because  we  are  able  to  justify  its  use. 

To  reiterate,  we  shall  set  up  a  system  of  symbolic  logic  which  will  define 
precisely  the  basic  logical  principles  which  are  in  common  use.  The 
system  of  symbolic  logic  will  in  no  way  justify  these  principles.  In  fact, 
because  there  is  no  guarantee  that  all  the  principles  defined  by  the  symbolic 
logic are valid, there is equally no guarantee that the symbolic logic is itself
valid. In fact, it is perfectly possible that the symbolic logic contains
a contradiction; that is, there may be a statement P such that both ⊢ P
and ⊢ ~P. If this were found to be the case, we should certainly have to
abandon the present symbolic logic and seek another.

The  reader  may  inquire  why  we  set  up  a  symbolic  logic  embodying  logical 
principles  whose  sole  justification  is  that  of  widespread  use.  Why  not 
instead  set  up  a  symbolic  logic  embodying  only  genuinely  valid  principles? 
We  would  if  we  could,  but  we  don't  know  how.  We  know  of  no  way  of 
deciding  for  sure  if  a  given  logical  principle  is  valid.  The  best  criterion  we 
have  found  as  yet  for  the  validity  of  a  logical  principle  is  that  of  widespread 
acceptance  by  careful  mathematicians.  This  is  admittedly  inconclusive, 
so  that  we  have  to  admit  the  possibility  that  the  logical  principles  defined 
by  our  symbolic  logic  may  be  invalid.  Our  justification  for  using  these 
principles  while  retaining  doubts  of  their  validity  may  be  summarized  in 
the  following  considerations: 

A.  None  of  these  principles  is  now  known  to  be  invalid  (although  some 
are  under  suspicion) . 

B. From these principles one can derive the existing body of
mathematics, which is very useful in the sciences and engineering, and even in
daily life.

Let  us  now  look  more  carefully  at  the  actual  structure  of  our  symbolic 
logic.  To  make  it  as  precise  as  possible,  we  have  made  it  as  mechanical  as 
possible. All reference to meaning is avoided, and only the forms of
statements are considered. For this reason almost no intelligence is required to
check a proof within the symbolic logic, where by a proof we mean an
actual demonstration of ⊢ P. We cannot avoid the use of a minimum of
intelligence. Thus, if S₁, S₂, ..., Sₛ is proposed as a demonstration of ⊢ P,
it is required in order to check this that we be able to decide that some of
the S's are axioms and that the remaining S's follow from previous S's by
modus ponens (or are the same as some previous S's). For this we have
to  be  able  to  recognize  different  occurrences  of  a  statement  (name  of  a 
statement,  actually)  as  occurrences  of  the  same  statement  (name),  to  be 
able  to  replace  occurrences  of  one  statement  (name)  by  occurrences  of 
another  statement  (name),  to  be  able  to  associate  together  properly  the 
left  and  right  ends  of  pairs  of  parentheses  in  a  complex  statement,  etc. 
The  intelligence  required  to  handle  these  matters  is  very  slight;  so  that, 
while  we  have  not  completely  eliminated  intelligence,  we  have  certainly 
reduced  the  need  for  it  to  the  point  where  it  seems  justified  to  claim  that 
the  checking  procedure  is  purely  mechanical.  That  is,  we  have  a  purely 
mechanical  equivalent  of  mathematical  reasoning. 
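
How little is demanded of the checker may be seen from the following Python sketch. It is an illustration only, resting on assumptions of our own: statements are encoded as strings or as ("imp", A, B), the test for being an axiom is supplied from outside as the parameter is_axiom, and the two toy axioms in the example are stand-ins chosen for brevity and are not axioms of our symbolic logic. The names follows_by_modus_ponens and check are likewise hypothetical.

```python
def follows_by_modus_ponens(s, earlier):
    """True if some earlier step A occurs together with the earlier step A ⊃ s."""
    return any(("imp", a, s) in earlier for a in earlier)

def check(steps, is_axiom):
    """Check a proposed demonstration step by step.

    Every step must be an axiom, or be the same as a previous step, or
    follow from two previous steps by modus ponens.
    """
    for i, s in enumerate(steps):
        earlier = steps[:i]
        if not (is_axiom(s) or s in earlier
                or follows_by_modus_ponens(s, earlier)):
            return False
    return True

# Toy axioms, for illustration only.
toy_axioms = {"A", ("imp", "A", "B")}

def is_toy_axiom(s):
    return s in toy_axioms

good = ["A", ("imp", "A", "B"), "B"]   # B follows from the first two steps
bad = ["A", "B"]                       # B has no justification here

print(check(good, is_toy_axiom))
print(check(bad, is_toy_axiom))
```

The checker does nothing beyond comparing statements for sameness of form, which is the sense in which the procedure is purely mechanical.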

In  the  present  text,  we  shall  not  devote  ourselves  exclusively  to  operating 
within the symbolic logic. Since we wish to make it seem probable that the
symbolic logic is a precise definition of the intuitive notion of logical
correctness, we shall be presenting a great many instances of ordinary
mathematical reasoning for comparison with the symbolic logic. This has
already happened in Sec. 4 of Chapter II. Also, in setting up the symbolic
logic, we shall proceed by analogy with the intuitive logic, and so shall need
to have it before us. This occurred in Sec. 1 of Chapter II and at places in
Sec. 3 of Chapter II. Thus we shall expect that in general we shall be
considering simultaneously two different logics, the intuitive everyday logic and
a mechanical symbolic logic which we are comparing with it.

Still another complicating factor enters the picture as follows. The
statement ⊢ P means that there is a demonstration whose last step is P. To
prove ⊢ P we are supposed to exhibit this demonstration, whereupon it is a
purely mechanical procedure to check that it is a demonstration. In
practice, this ideal arrangement cannot be carried out, simply because the
demonstrations become so extremely long as to be completely
unmanageable. Thus, in the proofs of Thms.IV.4.1 and IV.4.2, we actually exhibited
the demonstrations, but pure lack of space soon forced us to abandon this
procedure. If the reader doubts this, let him work Ex. IV.4.4 and IV.4.5.
Accordingly, we were compelled to seek other means of proving statements
of the form ⊢ P. There seems no way of doing this without introducing
still a third logic. In other words we have to deal simultaneously with the
ordinary intuitive logic of mathematics, with a mechanical symbolic logic,
and with a constructive intuitive logic which we use to prove things about
the symbolic logic. If we should take this third logic to be just the ordinary
intuitive logic of mathematics, then we could hardly claim that we are
framing a definition of the ordinary intuitive logic of mathematics.
Besides, since we do not wish to use this third logic and are forced to only by
limitations of time and space, we wish to keep this third logic to a minimum.
Further, we desire that the third logic shall have a maximum of
trustworthiness, so that if we claim to prove ⊢ P there will not be anyone who
will question our proof. In order to ensure this, we insist that we shall
always either write out a demonstration in full, as in the proofs of Thms.
IV.4.1 and IV.4.2, or else give explicit instructions whereby one could
write out a demonstration. In some of the proofs, as, for instance, the
proof of Thm.IV.5.2, we use proof by mathematical induction on n.
This assumes whatever simple properties of positive integers are embodied
in the principle of mathematical induction. Even in these more intricate
proofs, we still are adhering to our requirement that we must give explicit
instructions. In our induction proofs, we give explicit instructions for
n = 1, and then, assuming instructions available for n ≤ k with k a positive
integer, we write instructions for n = k + 1. Then for any finite value of n,
one could write out instructions by working up from n = 1, letting k be
successively 1, 2, ..., n − 1. Similar remarks hold for the proofs by cases
which we sometimes use. We use proof by cases only when we have
information which guarantees that our listing of cases is quite exhaustive. Then
for each case, appropriate instructions are given.

Thus  we  restrict  our  proofs  to  those  which  give  quite  positive  evidence 
of  the  existence  of  demonstrations,  and  thereby  justify  a  high  degree  of 
reliance  on  our  conclusions. 

To make clear just what is involved, suppose we should relax our
requirements and permit proofs by reductio ad absurdum of some of our theorems.
Say,  for  instance,  that  we  should  assume  that  no  demonstration  exists  for 
some  statement,  and  then  deduce  a  contradiction  from  our  assumption. 
Most  mathematicians  would  thereby  be  convinced  of  the  existence  of  a 
demonstration  for  that  statement.  However,  not  all  would  be  convinced. 
Some  mathematicians  deny  the  validity  of  the  principle  of  reductio  ad 
absurdum  in  such  a  situation.  They  would  insist  that,  until  we  had  either 
shown  them  a  demonstration  or  given  them  explicit  instructions  for  writing 
out  a  demonstration,  we  could  not  really  guarantee  the  existence  of  a 
demonstration.  We  avoid  such  objections  by  restricting  our  proofs  to 
those  which  give  direct  positive  evidence  of  the  existence  of  demonstrations. 

Let us repeat the relationship of our three logics. There is the purely
mechanical symbolic logic. It is operated without reference to meaning.
If we wish to prove ⊢ P for some P, we must do so purely by reference to
the form of P. To prove ⊢ P by the purely mechanical procedure
prescribed by our symbolic logic is in general much too long a process to be
practicable. So we have to permit the use of a slight amount of intuitive,
nonmechanical reasoning, to keep our proofs of such results as ⊢ P within
bounds. We insist that this reasoning be kept simple, direct, and
constructive. Be it noted that our proofs of ⊢ P still depend purely upon the
form of P and not upon its meaning. Thus it is genuinely possible to keep
our reasoning about the logic very simple and direct. The connection with
the everyday logic of mathematics is as follows. Suppose we have proved
⊢ P, either quite mechanically, or by use of some simple reasoning based
entirely on the form of P. We now give a meaning to P by interpreting
the symbols occurring therein (for instance, & is interpreted to be "and").
This meaning will turn out to be a principle of the everyday logic of
mathematics. At least, it always has so far in our experience. Conversely, given
a basic principle of the everyday logic of mathematics, we have so far been
able to find a P which expresses it, and such that ⊢ P. Needless to say,
it is no accident that this happens. We have taken great pains to choose
our axioms so as to cause this to happen in general. We have listed a large
number of instances to persuade the reader that this happens in general.
Finally, trusting that it does happen in general, we formulate our precise
definition of the basic everyday logic of mathematics by saying that it
consists of just those logical principles which are meanings of P's for which

⊢ P.

Needless  to  say,  the  meaning  of  P  will  not  have  any  connection  with  the 
reasoning  by  which  we  prove  |-  P,  since  the  latter  must  depend  entirely  on 
the  form  of  P. 

EXERCISES 

V.1.1.     If  the  symbolic  logic  which  we  are  studying  is  inconsistent,  then: 

(a)  What  can  one  conclude  about  the  everyday  logic  of  mathematics? 

(b)  What  can  one  conclude  about  the  simple  intuitive  logic  in  which  we 
prove  things  about  our  symbolic  logic? 


CHAPTER  VI 

THE  RESTRICTED  PREDICATE  CALCULUS 

1.  Variables  and  Unknowns.  To  the  layman,  the  trade-mark  of  a 
mathematician  is  his  use  of  the  symbol  x  to  denote  an  unknown  quantity. 
However,  the  letters  x,  y,  z,  m,  n,  etc.,  are  used  not  only  to  denote  unknowns, 
but  also  to  denote  variables.    Thus  we  may  write 

(VI.1.1)  x^2 - 4x + 3 = 0,

in  which  x  denotes  an  unknown  quantity  whose  value  is  to  be  determined, 
or  we  may  write 

(VI.1.2)  \sin^2 x + \cos^2 x = 1,

in which x denotes a variable for which we may substitute any angle (or
any real number, or even any complex number). We may also write

(VI.1.3)  \int x^2 \, dx = x^3/3,

(VI.1.4)  \int_0^3 x^2 \, dx = 9,

(VI.1.5)  \int_0^x y^2 \, dy = x^3/3,

and even

(VI.1.6)  \int_0^x x^2 \, dx = x^3/3.

In these, x probably denotes an unknown (or indeterminate) in (VI.1.5),
but a variable in the others. However, there is not universal agreement on
this point. Usually, in (VI.1.4) one speaks of x as the variable of
integration, but some writers insist that x does not really denote a variable
(or unknown either) in (VI.1.4), and that (VI.1.4) is merely a conventionalized
abbreviation for a very complicated definition. In any case, in (VI.1.3) to
(VI.1.6), we certainly have at least two distinct usages of the letter x and
perhaps as many as four distinct usages. To make matters still more
confusing, in (VI.1.6), two of the occurrences of x are being used in one way
and the other two in quite another way. Thus we can replace two of the x's
in (VI.1.6) by 3's, and obtain (VI.1.4), but we cannot replace the other two
x's by 3's, since this would result in the nonsense

\int_0^x 3^2 \, d3 = x^3/3;

nonetheless, we can replace them by y's to get (VI.1.5).

To cite still other uses of the letter x, we note that, when we speak of x^2
as the result of multiplying x by x, we are using x to denote an
indeterminate, but when we speak of the function x^2, we are using x to denote a
variable. In the point-slope form of the equation of a line

(VI.1.7)  y - y_0 = m(x - x_0),

the x and y denote variables which are the coordinates of a variable point
on the line (with the added complication that y may happen not to vary in
case m = 0), whereas x_0 and y_0 denote constants which are the coordinates
of a fixed (but unspecified) point.

There  are  still  other  uses  of  x  in  mathematics,  but  they  are  less  common 
and  can  be  dispensed  with. 

Additional complications in the usages connected with x arise from
carelessness with the use of names. Thus one often says "x is a variable" or
"x is an unknown." Actually, of course, x is the twenty-fourth letter of the
English alphabet and is temporarily being used as a name of a variable
or of an unknown, and a better terminology would be "x denotes a variable"
or "x denotes an unknown." In the main we have tried to use the more
careful  terminology,  but  on  occasion  we  follow  a  less  accurate  wording 
when  the  more  accurate  one  sounds  awkward. 

In  our  symbolic  logic  we  shall  make  use  of  letters  such  as  x,  y,  z  in 
manners  analogous  to  many  of  the  uses  listed  above.  However,  we  shall 
be  more  systematic  and  shall  introduce  various  special  notations  to  dis- 
tinguish different  uses  of  the  letters.  Such  use  of  different  notations  for 
different  uses  occurs  to  a  limited  extent  in  everyday  mathematics.  Thus, 
in  analytic  geometry  and  calculus,  we  have  the  convention  that  letters 
from  the  early  part  of  the  alphabet  shall  denote  constants  whereas  letters 
from  the  latter  part  of  the  alphabet  shall  denote  variables.  Also,  in  ele- 
mentary algebra  some  writers  distinguish  between  identities,  which  they 
write  with  three  bars,  and  equations,  which  they  write  with  two  bars. 
Thus  they  write 

(VI.1.8)  x^2 - y^2 \equiv (x + y)(x - y)

but

x^2 - 4x + 3 = 0,

it being indicated thereby that in the first statement x and y denote
variables whose values run over all real (or complex) numbers, whereas in the
second statement x denotes an unknown but fixed number whose value is
to be determined.

We  shall  defer  many  uses  of  the  letter  x  to  later  chapters  and  in  the 
present  chapter  shall  study  its  use  as  an  unknown  (or  indeterminate)  and 
as  a  variable  in  the  particular  sense  exemplified  in  the  three-bar  equivalence 
of  elementary  algebra  (see  equation  (VI.  1.8)  above).  The  study  of  these 
two  uses  constitutes  the  restricted  predicate  calculus. 

Before  beginning  to  develop  the  treatment  of  these  in  symbolic  logic,  let 
us  consider  in  more  detail  their  use  in  everyday  mathematics. 

We  cite  two  instances  of  the  use  of  variables, 

(VI.1.2)  \sin^2 x + \cos^2 x = 1,

(VI.1.8)  x^2 - y^2 \equiv (x + y)(x - y),

and two instances of the use of unknowns (or indeterminates),

(VI.1.1)  x^2 - 4x + 3 = 0,

(VI.1.9)  x^2 + y = x + y^2.

Many philosophers and logicians object to the use of unknowns in
statements on the ground that it is difficult (if not impossible) to assign any
proper  meaning  to  such  statements.  Mathematicians  have  never  let  such 
considerations  deter  them  from  making  common  use  of  unknowns  in 
statements.  In  our  symbolic  logic,  we  insist  on  paying  no  attention  to 
meanings  and  so  need  not  feel  the  slightest  hesitation  on  this  score  about 
using  unknowns.  Accordingly,  we  shall  use  unknowns  (or  indeterminates) 
in  much  the  way  that  they  are  used  in  everyday  mathematics. 

In  many  cases,  statements  refer  to  variables  or  unknowns  without  using 
letters  for  them,  such  as : 

"If  two  functions  are  continuous  at  a  point,  their  sum  is  continuous  at 
this  point." 

This  concerns  three  unknowns,  namely,  the  two  unspecified  functions  and 
the  unspecified  point,  as  will  be  clearer  if  one  writes  the  statement  in  the 
alternative form:

"If the functions f and g are continuous at a point x, then the function
f + g is continuous at the point x."

As  used  in  the  three-bar  equivalence  of  elementary  algebra,  a  variable  x 
denotes  a  quantity  which  varies  over  some  range.  It  is  permitted  that  one 
may substitute for a variable any value in its range. This is clearly
exemplified in (VI.1.2) and (VI.1.8). In many uses of variables, this property
is not satisfied, as in (VI.1.4), for instance, where one may not substitute
3 for x. This is probably the reason why some writers are reluctant to
consider x as a variable in (VI.1.4). Certainly if x is a variable in (VI.1.4),
it is a different kind of variable from that which we are presently
considering. Accordingly, it will be handled in our symbolic logic in quite a
different way from the way we shall handle the x's of (VI.1.2) and (VI.1.8), and
will not be considered until a later chapter.

In  distinction  to  a  variable,  an  unknown  (or  indeterminate)  is  not  sup- 
posed to  vary.  Thus,  suppose  one  starts  out  to  solve  for  x  in  (VI.  1.1).  If  x 
is  allowed  to  vary,  so  that  the  value  of  x  in  one  step  of  the  solution  is  not 
the  same  as  the  value  in  the  next  step,  then  we  shall  not  be  able  to  justify 
the  next  step.    Thus  we  can  proceed  from 

x^2 - 4x + 3 = 0

to

(x - 3)(x - 1) = 0

to

x = 3    or    x = 1

only  because  in  each  step  we  have  the  same  value  for  x  as  in  the  preceding 
step. 

To  cite  a  more  complicated  case,  one  may  look  at  the  proof  on  pages 
14 to 15 of Bocher, 1907, of:¹

"Theorem 1. If two functions are continuous at a point, their sum is
continuous at this point."

As we remarked earlier, this is a statement about two unknown functions
and an unknown point. Indeed, the proof begins:¹

"Let f_1 and f_2 be two functions continuous at the point (c_1, . . . , c_n) . . . ."

A  hasty  survey  of  Bocher's  proof  will  indicate  that  it  would  break  down 
completely  if  one  should  permit  a  change  to  some  other  point  or  some  other 
pair  of  functions  part  way  through  the  proof.  That  is,  the  point  and  func- 
tions, though  unknown,  are  not  variable. 

Nevertheless,  there  are  cases  in  which  we  permit  an  unknown  to  become 
a  variable,  and  indeed  this  occurs  in  a  very  important  logical  principle. 
To  get  an  instance  of  this,  let  us  follow  the  train  of  thought  which  Bocher 
started when he proved the theorem stated above. He next proves:¹

"Theorem 2. If two functions are continuous at a point, their product
is continuous at this point."

From these, Bocher infers that each polynomial is continuous at each
point. He then concludes:¹

"Theorem 3. Any polynomial is a continuous function for all values of
the  variables." 

1  From  Bocher,  "Introduction  to  Higher  Algebra,"  copyright  1907  by  The  Macmillan 
Company and used with their permission.

In this result, our point has ceased being an unknown and has become a
variable.  That  is,  a  heretofore  fixed  point  is  now  allowed  to  vary.  How 
does  this  come  about?  Bocher  does  not  stop  to  explain,  because  this  is  a 
standard  logical  procedure,  but  had  he  undertaken  to  explain,  he  would 
likely have given some such explanation as the following:

"We have shown that for a fixed polynomial and a fixed point, the polynomial
is continuous at the point. Now let us have a fixed polynomial and
a  variable  point.  To  show  that  the  polynomial  is  continuous  at  the  point, 
choose  a  value  for  the  point.  Holding  this  value  temporarily  fixed,  we  con- 
clude that  the  polynomial  is  continuous  at  that  value.  But  since  this  was 
any  value,  we  conclude  that  the  polynomial  is  continuous  at  all  values." 

Putting  this  principle  in  general  terms,  it  says  that,  if  one  can  prove  a 
statement about an unknown, then one can replace the unknown by a
variable. The justification is that, whatever value be chosen for the
variable, one can, for that value, carry out the proof given for the unknown
and  hence  infer  the  truth  of  the  theorem  for  that  value.  This  being  so  for 
each  value,  we  infer  the  theorem  for  all  values. 

This  principle  is  widely  used.  For  most  theorems  involving  variables, 
the  proof  is  usually  given  with  an  unknown  in  place  of  the  variable.  Thus 
to  prove 

\sin^2 x + \cos^2 x = 1,

one chooses some unknown (but fixed) x and proves the theorem for that x.
One  then  changes  to  a  variable  x. 

To cite other instances, consider a typical proof from plane geometry.
We quote (a bit freely) from Wentworth and Smith, page 32.¹

In an isosceles triangle (see Fig. VI.1.1) the angles opposite the equal
sides are equal.

Given: The isosceles triangle ABC, with AC equal to BC.

To prove: That ∠A = ∠B.

Proof. Suppose CD drawn so as to bisect ∠ACB. Then in the triangles ADC
and BDC, AC = BC (given), CD = CD (identity), and ∠ACD = ∠DCB
(construction). Hence triangle ADC is congruent to triangle BDC, and so
∠A = ∠B. Q.E.D.

[Fig. VI.1.1. The isosceles triangle ABC, with D on the side AB.]

Clearly, it would ruin this proof completely if halfway through it one
should permit the triangle ABC to change into some other isosceles triangle,
say one in which sides AB and BC were equal instead of sides AC and BC.
So the proof is carried out for some fixed (but undetermined) isosceles
triangle, and only after the proof is complete, and the theorem proved, do
we permit the replacement of our fixed, unknown triangle by a variable
triangle.

1 From "Plane and Solid Geometry" by George Wentworth and D. E. Smith,
copyright 1913, courtesy of Ginn & Company.

This is the important consideration. It is only in proved statements that
one can replace an unknown by a variable. In some statement such as

x^2 - 4x + 3 = 0

which cannot be proved, the x cannot become variable.

Because  the  same  letters  are  often  used  for  both  unknowns  and  variables, 
and  because  in  proved  theorems  unknowns  can  be  changed  to  variables, 
the  distinction  between  an  unknown  and  a  variable  is  often  not  carefully 
observed  in  everyday  mathematics.  Instead,  one  avoids  confusion  by 
paying  attention  to  the  meanings  of  the  statements  which  one  is  consider- 
ing. In  symbolic  logic,  where  no  attention  is  paid  to  meanings,  we  shall 
have  to  distinguish  between  variables  and  unknowns  solely  on  the  basis 
of  the  form  of  the  statements  in  which  they  occur.  This  will  require  a 
careful  attention  to  details  which  could  be  ignored  in  case  one  is  relying  on 
meaning  to  prevent  confusion.  It  will  also  permit  a  basic  simplification  of 
our  approach  to  variables  and  unknowns.  In  mathematics,  a  variable  is  a 
shadowy,  ill-defined  entity  which  varies  over  some  range  and  which  is 
denoted  by  a  letter  x.  In  symbolic  logic,  where  we  are  not  concerned  with 
meanings, the variable as such disappears, and we have left only the letter
x, now denoting nothing whatever. This has the advantage of freeing us
from explaining the difficult concept of a variable. The disadvantage is
that  our  letters  x,  y,  z  must  now  be  manipulated  by  purely  mechanical  rules, 
instead  of  by  reference  to  an  intuitive  idea  of  a  variable.  However,  for 
those  who  find  the  intuitive  idea  of  a  variable  hard  to  grasp,  this  disadvan- 
tage becomes  an  advantage. 

Although  we  now  operate  entirely  without  variables  or  unknowns,  and 
only  with  letters,  we  still  wish  to  manipulate  our  letters  as  though  some 
denoted  variables  and  some  unknowns.  So,  in  the  next  sections,  where  we 
set  up  our  rules  for  manipulating  letters,  we  wish  to  keep  reminding  the 
reader  that  certain  letters  are  to  be  treated  as  though  they  denoted  vari- 
ables and  certain  others  as  though  they  denoted  unknowns  (this  is  not 
necessary  to  an  understanding  of  the  rules,  but  we  think  it  will  be  helpful 
to  our  readers).  We  shall  do  this  by  saying  that  such  and  such  letter  is 
serving  as  a  variable,  while  such  and  such  other  letter  is  serving  as  an 
unknown. 

2.  Quantifiers.  We  shall  use  the  same  letters  to  serve  both  as  variables 
and as unknowns. This is an objectionable procedure, but it leads only to
complication and not to fallacy. Also, it conforms to standard mathematical
usage, and we follow it. We shall call these letters "variables,"
even  though  some  occurrences  may  be  serving  as  unknowns  and  some  as 
variables.  Those  occurrences  which  are  serving  as  unknowns  will  be  called 
"free"  occurrences  of  the  variable  and  those  occurrences  which  are  serving 
as  variables  will  be  called  "bound"  occurrences.  The  distinction  between 
free  and  bound  occurrences  will  be  based  entirely  on  the  form  of  the  state- 
ment, and  not  at  all  on  its  meaning. 

We  cannot  defend  this  terminology  on  the  ground  that  it  is  good,  because 
it  is  not.  It  is  merely  traditional.  In  the  first  place,  having  disposed  of  the 
notion  of  a  variable  and  confined  our  attention  to  letters  alone,  it  is  very 
silly  to  turn  around  and  call  these  letters  "variables,"  especially  since  they 
serve  sometimes  as  variables  and  sometimes  as  unknowns.  Also,  there  is 
certainly nothing about the words "free" or "bound" to suggest serving as
unknowns and variables; if anything, the terminology seems just backward.
However, it is the accepted terminology.

Besides &, ~, and variables, with which we have acquainted the reader,
we now introduce the notation "(x)", namely, a variable enclosed in a pair
of parentheses. This combination of symbols is to be prefixed to statements.
From the point of view of the symbolic logic, we are quite indifferent
to any proposed meaning for this and are concerned only with the formal
axioms  for  it.  However,  these  formal  axioms  will  be  much  more  under- 
standable and  easy  to  remember  if  we  first  discuss  the  interpretation  that 
would  be  put  on  "(x)"  if  one  were  to  consider  meanings. 

We interpret the prefix "(x)" to denote any one of:

"For all x, . . . ",
"For every x, . . . ",
"For each x, . . . ".

Thus, if we have a statement F(x) containing some occurrences of x, the
statement (x) F(x) shall have the interpretation that for every possible
value of x whatsoever, the statement F(x) is true. Thus we might take
F(x) to be

x^2 - 1 = (x + 1)(x - 1),

and then (x) F(x) indicates that, for every possible value of x,

x^2 - 1 = (x + 1)(x - 1).

It is of course permissible to prefix (x) to F(x) even in those cases where
F(x) is not true for every possible value of x. In such case, the result
(x) F(x) is a false statement. Thus we can and may write

(x) (x^2 - 4x + 3 = 0),

but it is false.

We shall also permit prefixing (x) to a statement P even if it does not
contain any occurrences of x. In such case, (x) P shall mean the same
thing as P. Thus we may write

(x)  (2  +  2  =  4), 

which  merely  means  "2  +  2  =  4,"  and  is  true.    Also  we  might  write 

(x)  (4  is  an  odd  prime), 

which  merely  means  "4  is  an  odd  prime,"  and  is  false. 
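
The readings just given can be imitated on a small scale by checking a statement at every value of a finite range. This is only a sketch under an assumption the text does not make: the names `for_all` and `DOMAIN` are hypothetical, and a short range of integers stands in for "every possible value of x whatsoever."

```python
# Illustrative only: DOMAIN is a finite stand-in for "every possible value of x".
DOMAIN = range(-10, 11)

def for_all(domain, statement):
    """Read (x) F(x) as: F(x) holds at every value in the domain."""
    return all(statement(x) for x in domain)

# (x) (x^2 - 1 = (x + 1)(x - 1)): holds at every value tried.
identity = for_all(DOMAIN, lambda x: x**2 - 1 == (x + 1) * (x - 1))

# (x) (x^2 - 4x + 3 = 0): may be written, but it fails (for instance at x = 0).
equation = for_all(DOMAIN, lambda x: x**2 - 4 * x + 3 == 0)

# (x) (2 + 2 = 4): x does not occur, so the result is that of "2 + 2 = 4".
vacuous_true = for_all(DOMAIN, lambda x: 2 + 2 == 4)

# (x) (4 is an odd prime): x does not occur, and "4 is odd" already fails.
vacuous_false = for_all(
    DOMAIN, lambda x: 4 % 2 == 1 and all(4 % d != 0 for d in range(2, 4))
)

print(identity, equation, vacuous_true, vacuous_false)
```

A check over a finite range is of course no proof that a statement holds for all values; it only illustrates the intended reading of the prefix.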

The  reader  should  realize  that  the  meanings  indicated  above  for  (x)  F(x) 
and  (x)  P  are  relevant  only  when  we  come  to  interpret  our  proved  state- 
ments as  logical  principles  of  everyday  mathematics.  In  operating  within 
the  symbolic  logic,  only  the  forms  of  our  statements  are  relevant,  and  on 
the  basis  of  form  alone,  it  is  convenient  to  be  allowed  to  prefix  (x)  to  any 
statement  whatever,  without  concern  for  possible  occurrences  of  x  in  the 
statement. 

We  are  now  prepared  to  indicate  which  occurrences  of  variables  (letters, 
that is) are free and which are bound. In a statement such as "For all x,
x^2 - 1 = (x + 1)(x - 1)," clearly x is a variable. So in a corresponding
statement (x) F(x), we say that all occurrences of x are bound, since they
must necessarily be serving as variables. In fact we say that they are bound
by (x), and sometimes we speak of prefixing a (x) to F(x) as "binding the
occurrences of x in F(x) by (x)."

At the present time we shall not give any further details as to just exactly
how the variables occur in an F(x), but we do state the following conditions
which  they  satisfy: 

(1) The occurrences of a variable in ~P are free or bound according as
they  are  free  or  bound  in  the  part  P. 

(2)  The  occurrences  of  a  variable  in  P&Q  are  free  or  bound  according  as 
they  are  free  or  bound  in  the  parts  P  and  Q. 

(3)  All  occurrences  of  x  in  (x)  P  are  bound,  but  for  any  other  variable 
the  occurrences  in  (x)  P  are  free  or  bound  according  as  they  are  free  or 
bound  in  the  part  P. 

The condition (1) tells us that ~ neither binds nor frees the occurrences
of  a  variable.  This  agrees  with  the  fact  that  negating  a  statement  does  not 
alter  the  status  of  any  unknowns  or  variables  occurring  therein.  (2)  gives 
similar  information  about  &.  (3)  tells  us  that  (x)  binds  all  occurrences  of  x. 
That  is,  prefixing  a  (x)  corresponds  to  the  important  mathematical  opera- 
tion of  changing  an  unknown  to  a  variable.  However,  prefixing  a  (x)  does 
not  change  the  status  of  any  other  unknowns  or  variables  except  x. 
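
Because conditions (1) to (3) refer only to the form of a statement, they can be applied mechanically. The following is a minimal sketch, assuming a nested-tuple encoding of statements that is invented here for illustration (it is not the book's notation); the function name `occurrences` is likewise hypothetical. The letter inside the prefix (x) is counted as a bound occurrence.

```python
# Hypothetical encoding of statements as nested tuples (not the book's notation):
#   ("atom", ["x", "y"])   an atomic statement; the list gives its letters in order
#   ("not", P)             ~P
#   ("and", P, Q)          P&Q
#   ("all", "x", P)        (x) P

def occurrences(statement, bound=frozenset()):
    """List every occurrence of a letter, left to right, as (letter, status)."""
    kind = statement[0]
    if kind == "atom":
        return [(v, "bound" if v in bound else "free") for v in statement[1]]
    if kind == "not":   # condition (1): ~ neither binds nor frees
        return occurrences(statement[1], bound)
    if kind == "and":   # condition (2): & neither binds nor frees
        return occurrences(statement[1], bound) + occurrences(statement[2], bound)
    if kind == "all":   # condition (3): (x) binds every occurrence of x, and only x
        x, part = statement[1], statement[2]
        return [(x, "bound")] + occurrences(part, bound | {x})
    raise ValueError(f"unknown statement kind: {kind!r}")

# ~(x = y) & (x) (x = y), with each atom reduced to the letters it contains.
example = ("and", ("not", ("atom", ["x", "y"])), ("all", "x", ("atom", ["x", "y"])))
print(occurrences(example))
```

In the example, the two letters of the left-hand part come out free, while on the right every x is bound and the y remains free, as condition (3) requires.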

Let  us  contrast  the  situations  in  ordinary  mathematics  and  in  symbolic 
logic.    In  ordinary  mathematics,  the  x's  in 

x^2 - 1 = (x + 1)(x - 1)

would ordinarily denote a variable, but on occasion they might denote an
unknown (for instance, throughout a proof of the statement). However,
from the statement alone, there is no way to tell which is intended. In
symbolic logic we use two different statements to take the place of the
single statement of ordinary logic, namely, the two statements

x^2 - 1 = (x + 1)(x - 1)

and

(x) (x^2 - 1 = (x + 1)(x - 1)).

In  the  former,  all  occurrences  of  x  are  free  and  x  serves  as  an  unknown. 
In  the  latter,  all  occurrences  of  x  are  bound  and  x  serves  as  a  variable. 
Thus,  in  symbolic  logic  one  can  determine  the  role  of  a  letter  in  a  statement 
simply  by  the  form  of  the  statement. 

One  should  take  care  not  to  be  confused  by  the  fact  that  the  statement 

x^2 - 1 = (x + 1)(x - 1)

occurs  in  both  everyday  mathematics  and  symbolic  logic,  and  has  two 
possible  meanings  in  everyday  mathematics  but  only  one  meaning  in 
symbolic  logic,  namely,  the  less  commonly  used  of  its  two  meanings  in 
everyday  mathematics. 

The prefix (x) is called the universal quantifier, since (x) F(x) signifies
that F(x) is universally true. In terms of (x), we can define the existential
quantifier (Ex), signifying "For some x . . . ". This is done by noting that
(x) ~F(x) signifies that F(x) is false for every x, so that its negation
~(x) ~F(x) signifies that F(x) must be true for at least one x. So we
define (Ex) F(x) to be an abbreviation for ~(x) ~F(x). Then (Ex) F(x)
signifies any one of:

"For some x, F(x)."
"For at least one x, F(x) is true."
"There is an x such that F(x)."
"There exists an x such that F(x)."

It  is  doubtless  the  last  interpretation  which  suggested  the  name  "existen- 
tial quantifier"  for  (Ex). 
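
The definition of (Ex) from (x) can be mirrored directly. The sketch below assumes, purely for illustration, that x ranges over a small finite list of integers; `DOMAIN`, `for_all`, and `for_some` are hypothetical names, not anything defined in the text.

```python
DOMAIN = range(-10, 11)  # assumption: a finite stand-in for the range of x

def for_all(domain, statement):
    """(x) F(x): F(x) holds at every value in the domain."""
    return all(statement(x) for x in domain)

def for_some(domain, statement):
    """(Ex) F(x), unfolded literally as ~(x) ~F(x)."""
    return not for_all(domain, lambda x: not statement(x))

# (Ex) (x^2 - 4x + 3 = 0): x = 1 and x = 3 both qualify.
has_root = for_some(DOMAIN, lambda x: x**2 - 4 * x + 3 == 0)

# (x) (x^2 - 4x + 3 = 0): fails, since most values are not roots.
always_root = for_all(DOMAIN, lambda x: x**2 - 4 * x + 3 == 0)

# (Ex) (x * x = 2): no integer in the domain qualifies.
no_witness = for_some(DOMAIN, lambda x: x * x == 2)

print(has_root, always_root, no_witness)
```

On a finite domain the unfolded form agrees with a direct search for a witness, which is the content of the reading "There is an x such that F(x)."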

Notice that since (Ex) contains (x) as a part, all x's in (Ex) F(x) are
bound. 

By  combining  (Ex)  with  the  notion  of  equality,  we  shall  later  learn  how 
to  write  such  prefixes  as  "For  exactly  one  x  .  .  .  ",  "For  at  least  three 
different  x's  .  .  .  ",  and  the  like. 

If  a  statement  contains  occurrences  of  several  variables,  x,  y,  .  .  .  ,  it  is 
natural that one may wish to prefix several of (x), (y), . . . , or (Ex), (Ey),
. . . . Thus since x^2 - y^2 = (x + y)(x - y) is true for every y, we indicate
this by prefixing (y), getting (y) (x^2 - y^2 = (x + y)(x - y)). This
statement in turn is true for every x, and we indicate this by prefixing (x),
getting (x)(y) (x^2 - y^2 = (x + y)(x - y)).

This illustration should indicate that the double prefix (x)(y) denotes
"For all x and y . . . ". Similarly the triple prefix (x)(y)(z) denotes "For
all x, y, and z . . . ". We commonly abbreviate (x)(y) to (x,y), and (x)(y)(z)
to (x,y,z), and so on.

In like manner, we see that (Ex)(Ey) denotes "For some x and y . . . ",
and (Ex)(Ey)(Ez) denotes "For some x, y, and z . . . ". We abbreviate
these to (Ex,y) and (Ex,y,z).

Still other combinations are possible. Thus for each x we can find at
least one value of y such that x^2 + y = x + y^2 (namely, either y = x or
y = 1 - x). We express this information by the statement (x)(Ey)
(x^2 + y = x + y^2). This is not the same as (Ey)(x) (x^2 + y = x + y^2),
for the latter states that there is some y such that x^2 + y = x + y^2 for
every x, which is clearly not so.
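
The difference between the two orders of prefixes can be seen by brute force. The following sketch assumes a small finite range of integers for both letters (the names `DOMAIN` and `holds` are hypothetical); it illustrates the two readings and proves nothing about all numbers.

```python
DOMAIN = range(-5, 6)  # assumption: finite stand-in for the values of x and y

def holds(x, y):
    """The statement x^2 + y = x + y^2."""
    return x**2 + y == x + y**2

# (x)(Ey): for each x separately, some y works (y = x always does).
all_then_some = all(any(holds(x, y) for y in DOMAIN) for x in DOMAIN)

# (Ey)(x): one single y would have to work for every x at once.
some_then_all = any(all(holds(x, y) for x in DOMAIN) for y in DOMAIN)

# The values of y that work when x = 3, namely y = 1 - x and y = x.
witnesses_for_3 = [y for y in DOMAIN if holds(3, y)]

print(all_then_some, some_then_all, witnesses_for_3)
```

In the first reading the witness y is allowed to depend on x; in the second it is not, which is why the second statement fails.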

We  have  already  called  attention  to  the  mathematical  statement 

\int_0^x x^2 \, dx = x^3/3

in  which  two  occurrences  of  x  are  variables  of  integration  and  the  other  two 
may  be  unknowns.  Similarly,  in  symbolic  logic  we  can  have  both  bound 
and  free  occurrences  of  x  in  a  single  statement.  One  difference  is  that  there 
will  not  be  any  doubt  as  to  which  occurrences  are  bound  and  which  free. 
Consider  (Ex)  (x  =  3),  in  which  all  occurrences  of  x  are  bound,  and  x  =  7, 
in  which  all  occurrences  of  x  are  free.  We  can  form  the  logical  product  of 
these  two  and  get 

(x  =  7)&(Ex)  (x  =  3), 

in which the leftmost occurrence of x is free whereas the other two
occurrences of x are bound.

In trying to assign a meaning to this statement, the free and bound
occurrences of x have nothing to do with each other. Considered as a statement
about x, only the free occurrence of x is involved, and

(x = 7)&(Ey) (y = 3)
or 

(x  =  7)&(Ez)  (z  =  3) 

would  be  considered  to  be  the  same  statement  about  x.  In  fact  all  three 
make the statement that x = 7 and there is some object which equals 3.

To make a corresponding statement about y, one would write

(y  =  7)&(Ex)  (x  =  3) 
or 

(y  =  7)&(Ey)  (y  =  3) 
or 

(y = 7)&(Ez) (z = 3),

etc., since each of these statements signifies that y = 7 and there is some
object  which  equals  3. 

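This situation is quite analogous to that in which

\int_0^x x^2 \, dx = x^3/3,

\int_0^x y^2 \, dy = x^3/3,

and

\int_0^x z^2 \, dz = x^3/3

all make the same statement about x, and

\int_0^y x^2 \, dx = y^3/3,

\int_0^y y^2 \, dy = y^3/3,

and

\int_0^y z^2 \, dz = y^3/3

all make the corresponding statement about y.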

We can put this in more general terms. If we have a statement F(x)
about x, only free occurrences of x count in the meaning of F(x). So to
get a corresponding statement about y, we replace the free occurrences of x
by occurrences of y. Notice that this procedure of replacing all free
occurrences of x by occurrences of y to get the corresponding statement about y
is  a  purely  formal  procedure  and  can  be  carried  out  without  any  reference 
to  meaning. 
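
Since the procedure is purely formal, it can be written as a short recursive routine. The sketch below assumes a nested-tuple encoding of statements invented here for illustration; `replace_free` and `show` are hypothetical names. The prefix (Ex) is treated as a primitive that binds x, which agrees with its definition as ~(x) ~.

```python
# Hypothetical encoding of statements as nested tuples (not the book's notation):
#   ("atom", op, a, b)   e.g. ("atom", "=", "x", "7") for x = 7
#   ("not", P), ("and", P, Q)
#   ("all", v, P) for (v) P;  ("some", v, P) for (Ev) P, which binds v as (v) does

def replace_free(statement, x, y):
    """Replace each free occurrence of the letter x by the letter y."""
    kind = statement[0]
    if kind == "atom":
        _, op, a, b = statement
        return ("atom", op, y if a == x else a, y if b == x else b)
    if kind == "not":
        return ("not", replace_free(statement[1], x, y))
    if kind == "and":
        return ("and", replace_free(statement[1], x, y),
                replace_free(statement[2], x, y))
    if kind in ("all", "some"):
        v, part = statement[1], statement[2]
        if v == x:  # every occurrence of x inside is bound, so nothing changes
            return statement
        return (kind, v, replace_free(part, x, y))
    raise ValueError(f"unknown statement kind: {kind!r}")

def show(statement):
    """Render a statement roughly in the notation of the text."""
    kind = statement[0]
    if kind == "atom":
        return f"({statement[2]} {statement[1]} {statement[3]})"
    if kind == "not":
        return "~" + show(statement[1])
    if kind == "and":
        return show(statement[1]) + "&" + show(statement[2])
    prefix = "(" + ("E" if kind == "some" else "") + statement[1] + ") "
    return prefix + show(statement[2])

about_x = ("and", ("atom", "=", "x", "7"), ("some", "x", ("atom", "=", "x", "3")))
about_y = replace_free(about_x, "x", "y")
print(show(about_x), "becomes", show(about_y))
```

Only the leftmost x is replaced; the bound occurrences inside (Ex) (x = 3) are left alone, which gives one of the statements about y listed above.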

There  are  certain  obvious  cases  in  which,  if  we  have  a  statement  about  x, 
F(x), and replace all free occurrences of x by occurrences of y, then the
resulting statement about y, F(y), is not a corresponding statement about y;
to wit, if the original statement contains free occurrences of y as well as x.
Thus if

F(x) is (x - 2)(y + 2) = xy + 2x - 2y - 4,

then

F(y) is (y - 2)(y + 2) = y^2 + 2y - 2y - 4,

which is quite a different statement about y. However, in a case like this,
one would usually denote the first statement by F(x,y) and the second by
F(y,y). In other words, instead of speaking of

(x - 2)(y + 2) = xy + 2x - 2y - 4

as a statement about x, we usually speak of it as a statement about both
x and y; it is clearly then nonsense to ask for a corresponding statement
about y alone.

Actually  situations  of  the  sort  just  described  cause  no  trouble  if  one  is 
careful  not  to  presume  that  a  formula  with  free  occurrences  of  x  does  not 
also  contain  free  occurrences  of  y.  However,  there  is  a  more  subtle  situa- 
tion that  we  must  guard  against  in  which  we  can  have  a  statement  F(x) 
with  free  occurrences  of  x  only,  and  which  therefore  is  a  statement  about 
x  only,  which  is  such  that,  if  we  replace  these  free  occurrences  of  x  by 
occurrences  of  y,  then  we  shall  not  get  a  corresponding  statement  about  y. 
Thus  consider 

(VI.2.1)  (x = 7)&(Ey) (y ≠ x).

This says that x = 7 and there is some object which differs from x. If
we now replace all free occurrences of x by occurrences of y, we get

(VI.2.2)  (y = 7)&(Ey) (y ≠ y).

This says something different about y, to wit, "y = 7 and there is some
object which differs from itself."

The trouble clearly is that, whereas the occurrence of x is free in

(Ey) (y ≠ x),

an occurrence of y in the same position is bound. If we should express our
original statement in the quite equivalent form

(x = 7)&(Ez) (z ≠ x),

then we could replace the free occurrences of x by occurrences of y and get

(y = 7)&(Ez) (z ≠ y)

which does say the same thing about y.

A  quite  analogous  situation  can  occur  in  everyday  mathematics.  Thus, 
consider  the  statement 

\int_0^x (2y + x) \, dy = 2x^2.

If we replace x by y, we get

\int_0^y (2y + y) \, dy = 2y^2,

which  is  not  the  corresponding  statement  about  y.  In  fact,  the  first  is  a 
true  statement  about  x,  but  the  second  is  a  false  statement  about  y. 
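
Both integrals can be worked out directly. The derivation below is a sketch that assumes each integral runs from 0 to the upper limit named in its statement; in the second line the bound letter is renamed to t, a letter introduced here only for illustration.

```latex
\begin{align*}
\int_0^x (2y + x)\,dy &= \bigl[\,y^2 + xy\,\bigr]_{y=0}^{y=x} = x^2 + x^2 = 2x^2, \\
\int_0^y (2y + y)\,dy &= \int_0^y 3t\,dt = \tfrac{3}{2}\,y^2 \neq 2y^2 \qquad (y \neq 0).
\end{align*}
```

The x inside the first integrand is free, while the y that replaces it in the second is captured by the dy, which is why the substitution changes what is being asserted.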

In a situation illustrated by (VI.2.1) and (VI.2.2), in which upon
substituting occurrences of y for the free occurrences of x we get a bound
occurrence of y where there was a free occurrence of x, we say that the
substitution  causes  confusion.  Although  we  are  using  the  word  "con- 
fusion," which  usually  has  to  do  with  meaning,  we  are  actually  describing 
a  purely  formal  situation.  In  fact  let  us  give  the  following  precise  and 
purely  formal  definition. 

If  P  is  a  statement  and  Q  is  the  result  of  replacing  each  free  occurrence 
of  X  (if  any)  in  P  by  an  occurrence  of  y,  then : 

(1) If some bound occurrence of y in Q is the result of replacing a free
occurrence of x in P by an occurrence of y, then we say that the replacement
causes confusion.

(2) If no bound occurrence of y in Q is the result of replacing a free
occurrence of x in P by an occurrence of y, then we say that the
replacement causes no confusion.

We stated this in a form which allows the possibility that there might be
no free occurrences of x in P. In such case, Q would be the same as P, and
clearly the replacement would cause no confusion. We also permit P to
contain free occurrences of y. Then the case in which the replacement
causes no confusion would be characterized by writing F(x,y) for P and
F(y,y) for Q, and the meanings of P and Q would be related in a
corresponding fashion.
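Because the definition is purely formal, it can be checked mechanically. The following is a minimal sketch in Python, offered only as an illustration and not as part of the text. It assumes a hypothetical encoding of statements as nested tuples: ("atom", label, variables), ("not", P), ("and", P, Q), ("all", v, P) for (v) P, and ("ex", v, P) for (Ev) P. The labels "eq7" and "neq" are likewise invented for the example.

```python
# Sketch only: statements are encoded as nested tuples (an assumed encoding).
#   ("atom", label, (v1, v2, ...))  atomic statement listing its variable occurrences
#   ("not", P), ("and", P, Q)       the connectives ~ and &
#   ("all", v, P), ("ex", v, P)     the quantifiers (v) P and (Ev) P

def substitute(p, x, y, bound=frozenset()):
    """Return Q: the result of replacing each free occurrence of x in p by y."""
    tag = p[0]
    if tag == "atom":
        if x in bound:  # occurrences of x here are bound, so they are left alone
            return p
        return ("atom", p[1], tuple(y if v == x else v for v in p[2]))
    if tag == "not":
        return ("not", substitute(p[1], x, y, bound))
    if tag == "and":
        return ("and", substitute(p[1], x, y, bound), substitute(p[2], x, y, bound))
    _, v, body = p  # a quantifier on v
    return (tag, v, substitute(body, x, y, bound | {v}))

def causes_confusion(p, x, y, bound=frozenset()):
    """True if some free occurrence of x in p lies within the scope of a
    quantifier on y, so that the replacing occurrence of y becomes bound."""
    tag = p[0]
    if tag == "atom":
        return x in p[2] and x not in bound and y in bound
    if tag == "not":
        return causes_confusion(p[1], x, y, bound)
    if tag == "and":
        return (causes_confusion(p[1], x, y, bound)
                or causes_confusion(p[2], x, y, bound))
    _, v, body = p
    return causes_confusion(body, x, y, bound | {v})

# (x = 7)&(Ey) (y ≠ x): replacing x by y causes confusion.
P1 = ("and", ("atom", "eq7", ("x",)), ("ex", "y", ("atom", "neq", ("y", "x"))))
# (x = 7)&(Ez) (z ≠ x): replacing x by y causes no confusion.
P2 = ("and", ("atom", "eq7", ("x",)), ("ex", "z", ("atom", "neq", ("z", "x"))))

Q1 = substitute(P1, "x", "y")
Q2 = substitute(P2, "x", "y")
```

Here causes_confusion(P1, "x", "y") holds and causes_confusion(P2, "x", "y") does not, matching the two forms of the original statement discussed above.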

Theorem VI.2.1. If Q is the result of replacing all free occurrences of x
in P by occurrences of y, and P is the result of replacing all free occurrences
of y in Q by occurrences of x, then neither replacement causes confusion.

Proof. Fix attention on some free occurrence of x in P. It becomes an
occurrence of y in Q when we replace all free occurrences of x in P by
occurrences of y. Let us now replace all free occurrences of y in Q by
occurrences of x. It is assumed that the result of this is P. So our particular
occurrence of y must be replaced by an occurrence of x, which can be
the case only if this occurrence of y is free in Q. Applying this reasoning in
turn to each free occurrence of x in P, we conclude that every free occurrence
of x in P becomes a free occurrence of y in Q. Hence this replacement
causes no confusion. Similarly, each free occurrence of y in Q becomes a
free occurrence of x in P, and this replacement also causes no confusion.

In case P and Q are related as in the hypothesis of this theorem, P will
contain no free occurrences of y, and Q will contain no free occurrences of x.
Also, by our theorem, it will cause no confusion to replace free occurrences
of x in P by occurrences of y, and vice versa. In such case, Q will make
exactly the same statement about y that P makes about x. This would
generally be characterized by writing F(x) for P and F(y) for Q.

In  order  to  save  much  repetition  of  assumptions,  we  set  up  the  following 
conventions. 

In case we refer to two statements as F(x,y) and F(y,y), it shall be
assumed that F(y,y) is the result of replacing each free occurrence of x
(if any) in F(x,y) by an occurrence of y and that this replacement causes no
confusion. It is not assumed that there actually are free occurrences of
either x or y in F(x,y), nor is it assumed that there are not free occurrences
of other variables in F(x,y).

In case we refer to two statements as F(x) and F(y), it shall be assumed
that F(y) is the result of replacing all free occurrences of x in F(x) by
occurrences of y, and F(x) is the result of replacing all free occurrences of
y in F(y) by occurrences of x. It is not assumed that there actually are
free occurrences of x in F(x), nor is it assumed that there are not free
occurrences of variables other than x and y in F(x). Our assumptions do
assure that there are no free occurrences of y in F(x) and no free occurrences
of x in F(y).

Just as one could write

∫_0^1 ∫_0^x x² dx dx = ∫_0^1 (x³/3) dx = 1/12,

so we could prefix an (Ex) or (x) to

(x = 7)&(Ex) (x = 3),
getting, for instance,

(Ex) ((x = 7)&(Ex) (x = 3)).

We  shall  seldom  have  any  occasion  to  write  anything  of  this  sort,  but 
there  is  no  harm  in  doing  so.    Since  in 

(x = 7)&(Ex) (x = 3),

only  the  leftmost  x  is  relevant  to  the  meaning  of  the  statement,  only  this  x 
is  bound  by  the  additional  (Ex)  which  we  propose  to  prefix.  Or  to  put  it 
otherwise,  since 

(x = 7)&(Ex) (x = 3)

and

(x = 7)&(Ey) (y = 3)

are supposed to have the same meanings, we attach to (Ex) ((x = 7)&(Ex)
(x = 3)) the same meaning that we would attach to

(Ex) ((x = 7)&(Ey) (y = 3)).

Such  questions  of  meaning  arise  only  when  we  come  to  interpret  our 
statements.  In  the  formal  manipulations,  these  matters  cause  no  trouble 
at  all. 

A  similar  matter  is  the  question  of  the  meaning  to  be  given  to  (x)  (x)  P 
or  (x)(Ex)  P  or  (Ex)(x)  P  or  (Ex)  (Ex)  P.  Consider  (x)(x)  P.  Since  all 
x's  in  (x)  P  are  bound,  it  follows  that,  as  far  as  meaning  is  concerned, 
(x)  P  contains  no  x's.  In  such  case  (x)(x)  P  has  the  same  meaning  as 
(x)  P.    Similarly  for  the  others. 

With  regard  to  the  omission  of  parentheses  without  ambiguity,  we  shall 
agree  that  the  abbreviations  listed  in  the  left-hand  column  below  have  the 
meanings  listed  beside  them  in  the  right-hand  column. 

(x) PQ          (x) (PQ)

(x) P&Q         (x) (P&Q)

(x) PvQ         ((x) P)vQ

(x) P ⊃ Q       ((x) P) ⊃ Q

(x) P ≡ Q       ((x) P) ≡ Q

That is, & is weaker than (x), but v, ⊃, and ≡ are all stronger than (x).
Clearly there is no ambiguity in such forms as (x) ~P, ~(x) P, P(x) Q,
Pv(x) Q, P ⊃ (x) Q, and P ≡ (x) Q.

We use exactly similar conventions for (Ex). We do not need to choose
which of (x) or (Ex) shall be stronger, for there is clearly no ambiguity in
such forms as (x)(Ey) P, (Ex)(y)(z) P, etc.

The conventions for (x,y) are the same as for (x) and the conventions for
(Ex,y) are the same as for (Ex).

Dots can be used to strengthen symbols just as before. Thus the
abbreviations in the left-hand column below have the meanings which are listed
beside them in the right-hand column.

(x) P.Q             ((x) P)Q

(x).PvQ             (x) (PvQ)

(x).P ⊃ Q.vR        ((x) (P ⊃ Q))vR

(x):P ⊃ Q.vR        (x) ((P ⊃ Q)vR)

Let us now look at some statements of ordinary mathematics which
require quantifiers for their translation into symbolic logic. Consider the
statement "f(x) is continuous at the point x." First of all, let us write the
definition of this in ordinary language:

"For each positive ε there is a positive δ such that whenever |y − x| < δ
we have |f(y) − f(x)| < ε."

We adopt ε > 0, δ > 0, |y − x| < δ, and |f(y) − f(x)| < ε as statements
of symbolic logic, and inquire how we shall put them together with
quantifiers and logical connections to get a translation of the above
statement.

Consider the part "whenever |y − x| < δ we have |f(y) − f(x)| < ε."
If we said merely "if |y − x| < δ, then |f(y) − f(x)| < ε," we would
translate it as

|y − x| < δ. ⊃ .|f(y) − f(x)| < ε.

However, the use of "whenever" strengthens this to

(y):|y − x| < δ. ⊃ .|f(y) − f(x)| < ε.

That is, the "whenever" converts y into a variable, whereas with
"if . . . then . . . ," y might be an unknown.

How shall we say "there is a positive δ such that F(δ)"? If we wished
to say merely "there is a δ such that F(δ)," we could write

(Eδ) F(δ).

To specify that there is a positive δ such that F(δ), we are claiming that δ
simultaneously has the two attributes "δ is positive" and F(δ). That is,
we are claiming

δ > 0. F(δ)

for some δ. In other words we claim

(Eδ).δ > 0. F(δ).

So "there is a positive δ such that whenever |y − x| < δ we have
|f(y) − f(x)| < ε" becomes

(Eδ):δ > 0:(y):|y − x| < δ. ⊃ .|f(y) − f(x)| < ε.

How can we write "For each positive ε, G(ε)"? By analogy with our
treatment of δ, one might be tempted to write

(ε).ε > 0. G(ε).

This is completely wrong. The logical product

ε > 0. G(ε)

asserts that ε both is positive and possesses the property G(ε). Then

(ε).ε > 0. G(ε)

asserts this for each ε. In other words, it states that every ε is positive in
addition to asserting G(ε) for each ε. This is quite absurd and is quite a
different thing from asserting G(ε) for every positive ε.

Actually, the form we are seeking is

(ε). ε > 0 ⊃ G(ε),

as should become apparent from a study of this form. If we have this
statement given, then certainly we can conclude G(ε) whenever we have
ε > 0. Also, from this statement alone, we cannot conclude G(ε) for any ε
for which it is false that ε > 0, although (and this is proper) it is not
forbidden that G(ε) be true for some ε for which ε > 0 is false.
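The difference between the wrong form and the right form can be seen concretely on a small finite range of values. The following Python lines are a rough sketch under stated assumptions: the list named domain and the property G are hypothetical choices made only for illustration, and Python's all and any stand in for the quantifiers (ε) and (Eε) over that finite list.

```python
# Sketch only: "domain" and "G" are hypothetical, chosen for illustration.
domain = [-2, -1, 0, 1, 2, 3]

def G(e):
    """An assumed property; it happens to hold exactly when e >= 1."""
    return e >= 1

# (ε).ε > 0. G(ε): claims every value is positive AND has the property G.
wrong_form = all((e > 0) and G(e) for e in domain)

# (ε). ε > 0 ⊃ G(ε): claims G only for those values which are positive.
right_form = all((not (e > 0)) or G(e) for e in domain)

# (Eδ).δ > 0. F(δ): for "some positive", the logical product IS the right form.
some_positive_with_G = any((d > 0) and G(d) for d in domain)
```

On this domain wrong_form is false, merely because the domain contains values which are not positive, while right_form is true. The existential case behaves the other way round, which is why the logical product is correct there.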

So we get finally as the translation of "f(x) is continuous at the point x"
the statement

(ε):.ε > 0. ⊃ :(Eδ):δ > 0:(y):|y − x| < δ. ⊃ .|f(y) − f(x)| < ε.

We say that f(x) is continuous in an interval if it is continuous at each
point of the interval. If we translate "x is in the interval (a,b)" by the
familiar formula a < x < b, then we can translate "f(x) is continuous in
the interval (a,b)" by

(x)::a < x < b. ⊃ :.(ε):.ε > 0. ⊃ :(Eδ):δ > 0:(y):
|y − x| < δ. ⊃ .|f(y) − f(x)| < ε.

Often when we are speaking of continuity in an interval, we modify the
definition of continuity at the end points so as to require that y as well as x
be in the interval (see Hardy, 1947, page 186). This revised definition
would be translated as:

(x)::a < x < b. ⊃ :.(ε):.ε > 0. ⊃ :(Eδ):δ > 0:(y):a < y < b.
|y − x| < δ. ⊃ .|f(y) − f(x)| < ε.

We say that f(x) is uniformly continuous in the interval (a,b) if for each
positive ε there is a positive δ such that whenever x and y are in the interval
(a,b) and |y − x| < δ we have |f(y) − f(x)| < ε. This would be
translated as

(ε)::ε > 0. ⊃ :.(Eδ):.δ > 0:.(x):.a < x < b. ⊃ :(y):a < y < b.
|y − x| < δ. ⊃ .|f(y) − f(x)| < ε.

From  a  formal  point  of  view,  the  only  difference  in  the  two  statements  is 
in  the  location  of  the  part 

(x).a < x < b ⊃ .

However,  both  formally  and  intuitively  this  change  is  quite  important. 
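For readers more used to present-day notation, the revised definition of continuity in an interval and the definition of uniform continuity can be set one under the other. This is only a transcription sketch: the symbols \forall, \exists, \rightarrow, and \wedge are assumed modern counterparts of (x), (Ex), ⊃, and the logical product, and are not the notation of this text.

```latex
% Continuity in the interval (revised definition): \delta may depend on x.
\forall x\,\Bigl( a < x < b \rightarrow
  \forall \varepsilon\,\bigl( \varepsilon > 0 \rightarrow
    \exists \delta\,\bigl( \delta > 0 \wedge
      \forall y\,( a < y < b \wedge |y - x| < \delta
        \rightarrow |f(y) - f(x)| < \varepsilon ) \bigr) \bigr) \Bigr)

% Uniform continuity in the interval: one \delta serves for every x.
\forall \varepsilon\,\Bigl( \varepsilon > 0 \rightarrow
  \exists \delta\,\bigl( \delta > 0 \wedge
    \forall x\,\bigl( a < x < b \rightarrow
      \forall y\,( a < y < b \wedge |y - x| < \delta
        \rightarrow |f(y) - f(x)| < \varepsilon ) \bigr) \bigr) \Bigr)
```

In the first statement the quantifier on x stands outside the quantifier on δ, so δ may be chosen after x is given. In the second it stands inside, so δ must be chosen before x is given.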

The mathematical literature is composed mainly of statements which
require the use of quantifiers to translate them into symbolic logic, so that
we shall see many more examples of the use of quantifiers.

We should like to caution the reader against the word "any." Sometimes
"any" means "each" and sometimes it means "some." Thus,
sometimes "for any x . . ." means (x), and sometimes "for any x . . ." means
(Ex). We shall quote literally two examples from the mathematical
literature. In the first, the author intended "each" when he wrote "any,"
and in the second the author intended "some" when he wrote "any." The
statements are "If x_1, . . . , x_n are real numbers, and if for any x_i we have
x_i = 0, then x_1² + · · · + x_n² = 0," and "If X_1, . . . , X_n are vectors, and if
for any X_i we have X_i = 0, then the set of vectors are linearly dependent."

If one wishes to be sure that one will be understood, one should never
use "any" in a place where either of "each" or "some" can be used. However,
there are cases in which use of "any" is correct and neither of "each"
or "some" can be used. For instance, "I will not violate any law." The
statement  "I  will  not  violate  each  law"  has  quite  a  different  meaning,  and 
the  statement  "I  will  not  violate  some  law"  might  be  interpreted  to  mean 
that  there  is  a  particular  law  which  I  am  determined  not  to  violate. 

Nonetheless,  many  writers  use  "any"  in  places  where  "each"  or  "some" 
would  be  preferable. 

EXERCISES 

VI.2.1. Using x > 0 and x² > 0 as statements of symbolic logic, write a
translation for "x² > 0 whenever x > 0."

VI.2.2. Using a < b + c, c > 0, and a < b as statements of symbolic
logic, write a translation for "If a < b + c whenever c > 0, then a < b."

VI.2.3. If F(f(n)), G(f(n)), and H(f(n)) are translations of "f(n) is a
polynomial with integral coefficients," "f(n) is a constant," and "f(n) is a
prime," write a translation of "No polynomial f(n) with integral
coefficients, other than a constant, can be prime for all n."

VI.2.4. If F(x) and G(x,y) are translations of "x is the product of a
finite  number  of  factors"  and  "y  is  a  factor  of  x"  and  we  accept  statements 
of  the  form  "z  =  0"  as  statements  of  symbolic  logic,  write  a  translation  of 
"If  the  product  of  a  finite  number  of  factors  is  zero,  at  least  one  of  the 
factors  must  be  zero." 

VI.2.5.  If  F(x)  is  a  translation  of  "x  is  an  integer"  and  we  permit  use 
of  the  symbol  e  and  statements  of  the  form  x  =  y/z  and  x  =  0  in  symbolic 
logic,  write  a  translation  of  "e  is  irrational." 

VI.2.6. Using (f,φ_n) = 0 and f = 0 as statements of symbolic logic,
write a translation for "(f,φ_n) = 0 for every n implies f = 0."

VI.2.7. If F(f(x)) and G(f(x),a) are translations of "f(x) is a polynomial
in x," and "a is a coefficient of f(x)" and we accept statements of the form
f(x) = 0 and a = 0 as statements of symbolic logic, write a translation of
"a necessary and sufficient condition that a polynomial in x vanish
identically is that all its coefficients be zero."

VI.2.8. If F(x,y) and G(x,y,z) are translations of "x divides y" and
"x is the greatest common divisor of y and z" write a translation of "Every
common divisor of a and b divides their greatest common divisor."

VI.2.9. If F(x) and G(x) are translations of "x is an odd prime" and
"x is an integer" and we accept statements of the form 1 + x² + y² = mp
and 0 < m < p as statements of symbolic logic, write a translation of
"If p is an odd prime, then there are integers x and y such that
1 + x² + y² = mp for some integer m with 0 < m < p."

VI.2.10. If F(x) and G(x) are translations of "x is an integer of k(p)"
and "x is an integer" and we accept statements of the form a = a + bp as
statements of symbolic logic, write a translation of "The integers of k(p)
are the numbers a = a + bp with integral a, b."

VI.2.11. If F(x) is a translation of "x is an integer" and we accept
statements of the form 0 < x < 1 as statements of symbolic logic, write a
translation of "There is no integer between 0 and 1."

VI.2.12. If F(x) is a translation of "x is a rational number" and we
accept statements of the form x = y and x < y as statements of symbolic
logic, write a translation of "There is a rational between any two rationals."

VI.2.13. If F(x) and G(x) are translations of "x is a rational number"
and "x is a member of S" and we accept statements of the form x < y as
statements of symbolic logic, write a translation of "Every rational member
of S is less than every rational nonmember of S."

VI.2.14.  If  G{x)  is  a  translation  of  "x  is  a  member  of  S"  and  we  accept 
statements  of  the  form  x  =  y  and  x  <  y  as  statements  of  symbolic  logic, 
write  a  translation  of  "S  has  a  greatest  member." 

VI.2.15. If G(x) is a translation of "x is a member of S" and we accept
statements of the form x = y, x > 0, and |x − y| < z as statements of
symbolic logic, write a translation for "For every positive ε there is a
member of S different from x whose distance from x is less than ε."

VI.2.16. If G(x) is a translation of "x is a member of S" and we accept
statements of the form x = y as statements of symbolic logic, write
translations of:

(a)  S  has  at  least  two  members. 

(b)  S  does  not  have  at  least  two  members. 

(c)  S  has  exactly  one  member. 

VI.2.17. Find a statement F(x,y) of mathematics such that (x)(Ey)
F(x,y) is true but (Ey)(x) F(x,y) is false.

VI.2.18. If the occurrences of the variables x and y in x = y are free,
identify which occurrences of variables are bound and which free in the
following statements:

(a) (Ey).y = x.

(b) (Ez)(y):z = y. ≡ .y = x.

(c) x = y: ≡ :(z):x = z. ≡ .y = z.

(d) y = x. ⊃ .(Ey).y = x.

VI.2.19. If G(x) is a translation of "x is a positive integer" and we
accept statements of the form x > 0, x > y, and |a_x − a_y| < z as
statements of symbolic logic, write a translation for "For every positive ε there
is a positive integer m such that for every greater positive integer n,

|a_m − a_n| < ε."

3. Axioms for the Restricted Predicate Calculus. We now get down to
formal matters, and state some axioms. Before doing so, we wish to discuss
one important characteristic which our axioms should have. We stated
earlier the important principle that, if one can prove a statement about an
unknown, then one can replace the unknown by a variable. Within our
symbolic logic, this will take the form:

If ⊢ F(x), then ⊢ (x) F(x).

In order to be able to prove this, we have to define our axioms in such a
way that, if F(x) is an axiom, then so is (x) F(x), and, in general, if P is an
axiom, then (x_1)(x_2) · · · (x_n) P is an axiom. This we do as follows:

If P, Q, and R are statements, not necessarily distinct, and x, x_1, x_2, . . . , x_n
are variables, not necessarily distinct, then each of the following is an axiom:

1. (x_1)(x_2) · · · (x_n) (P ⊃ PP).

2. (x_1)(x_2) · · · (x_n) (PQ ⊃ P).

3. (x_1)(x_2) · · · (x_n) (P ⊃ Q. ⊃ .~(QR) ⊃ ~(RP)).

4. (x_1)(x_2) · · · (x_n) ((x).P ⊃ Q: ⊃ :(x) P. ⊃ .(x) Q).

If x, x_1, x_2, . . . , x_n are variables, not necessarily distinct, and P is a
statement with no free occurrences of x, then the following is an axiom:

5. (x_1)(x_2) · · · (x_n) (P ⊃ (x) P).

If x, y, x_1, x_2, . . . , x_n are variables, not necessarily distinct, and P is a
statement, and Q is the result of replacing each free occurrence (if any) of x
in P by an occurrence of y, and no bound occurrence (if any) of y in Q is
the result of replacing a free occurrence of x in P by an occurrence of y,
then the following is an axiom:

6. (x_1)(x_2) · · · (x_n) ((x) P ⊃ Q).

By  the  conventions  which  we  set  up  in  the  previous  section,  we  could 
simplify  this  last  to : 

If x, y, x_1, x_2, . . . , x_n are variables, not necessarily distinct, then the
following is an axiom:

6. (x_1)(x_2) · · · (x_n) ((x) F(x,y) ⊃ F(y,y)).

We note that, in each of the above, it is permitted that n = 0, so that

1. P ⊃ PP.

2. PQ ⊃ P.

3. P ⊃ Q. ⊃ .~(QR) ⊃ ~(RP).

4. (x).P ⊃ Q: ⊃ :(x) P. ⊃ .(x) Q.

5. P ⊃ (x) P if no free occurrences of x in P.

6. (x) F(x,y) ⊃ F(y,y).

are axioms.

We  shall  refer  to  the  forms  indicated  above  as  Axiom  schemes  1  to  6. 
Subsequent  axiom  schemes  which  we  shall  add  in  later  chapters  will  be 
referred  to  as  Axiom  schemes  7,  8,  ...  . 

The type of axiom scheme, of the form

(x_1)(x_2) · · · (x_n) P,

will be used in stating all future axioms. For this reason, the following
theorem will be obvious.

Theorem VI.3.1. If P is an axiom and x is a variable, then (x) P is an
axiom.

We might point out certain properties of Axiom schemes 1 to 6. Axiom
schemes 1 to 3 are just the truth-value axioms added to in such a way as to
make Thm.VI.3.1 true. Axiom scheme 4 is important for the proof that if
⊢ P then ⊢ (x) P. Axiom schemes 5 and 6 tell us that, if P has no free
occurrences of x, then ⊢ P ≡ (x) P; for we get ⊢ P ⊃ (x) P by Axiom
scheme 5, and by putting P for F(x,y) in Axiom scheme 6, we get
⊢ (x) P ⊃ P. The result "⊢ P ≡ (x) P if there are no free occurrences of x
in P" accords with our convention that, if there are no x's in P, then we
shall attach the same meaning to (x) P as to P. Finally Axiom scheme 6
gives a formal expression of the principle that one can replace a variable
by any one of its values.
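That the instances of Axiom schemes 1 to 3 with n = 0 are truth-value tautologies can be confirmed by running through all assignments of truth values. The following Python lines are a minimal sketch of such a check, not a part of the formal system: the helper names implies, scheme1, scheme2, scheme3, and is_tautology are hypothetical, and P, Q, R are treated simply as truth values, so nothing is claimed here about the quantified schemes 4 to 6.

```python
from itertools import product

def implies(a, b):
    """Truth-value reading of a ⊃ b."""
    return (not a) or b

def scheme1(p):
    # P ⊃ PP
    return implies(p, p and p)

def scheme2(p, q):
    # PQ ⊃ P
    return implies(p and q, p)

def scheme3(p, q, r):
    # P ⊃ Q. ⊃ .~(QR) ⊃ ~(RP)
    return implies(implies(p, q), implies(not (q and r), not (r and p)))

def is_tautology(form, number_of_letters):
    """True if the form takes the value truth under every assignment."""
    assignments = product([False, True], repeat=number_of_letters)
    return all(form(*values) for values in assignments)

results = [is_tautology(scheme1, 1),
           is_tautology(scheme2, 2),
           is_tautology(scheme3, 3)]
```

Each entry of results comes out true. By contrast, a form such as P ⊃ PQ fails the same check, since it takes the value falsehood when P is true and Q is false.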

In Chapter VIII we shall discuss another type of variable, such as in

∫_0^3 x² dx = 9,

for which one cannot replace the variable by any of its values.

EXERCISES 

VI.3.1. Find a statement P that would arise naturally in a mathematical
discussion such that, if Q is the result of replacing each free occurrence of
x in P by an occurrence of y, then (x) P ⊃ Q is false. Explain why your
particular example does not violate Axiom scheme 6.

VI.3.2. Prove that the following statement is true about each statement
P which is derivable from the Axiom schemes 1 to 6 by means of modus
ponens:

"If in P we replace all occurrences of variables whatever by occurrences
of the particular variable x, and if we further remove from P all quantifiers
whatever, then the resulting expression Q is built up from various
statements R_1, . . . , R_r (clearly the only variable occurring in any of the R's is x)
by means of ~ and & alone, and is of such a form that no matter what set
of truth values we assign to the R's, the corresponding value of Q is
truth."

(Hint. Use induction on the number of steps in the demonstration of
⊢ P.)

VI.3.3.  Using  the  previous  exercise,  prove  that  one  cannot  derive  a 
contradiction  from  Axiom  schemes  1  to  6  by  use  of  modus  ponens. 

4. The Generalization Principle. The generalization principle is that,
if P is proved, one may infer (x) P.

*Theorem VI.4.1. If P is a statement and x is a variable and ⊢ P, then
⊢ (x) P.

Proof. Let S_1, S_2, . . . , S_s be a demonstration of ⊢ P. Then S_s is P,
and for each S_i either:

(1) S_i is an axiom.

(2) There is a j less than i such that S_i and S_j are the same.

(3) There are j and k, each less than i, such that S_k is S_j ⊃ S_i.

We take (x) S_1, (x) S_2, . . . , (x) S_s to be key steps of our demonstration of
⊢ (x) P. The last of our key steps, (x) S_s, is (x) P, as desired. By filling in
additional steps before each of our key steps, we can build up a complete
demonstration. We do this as follows.

Case 1. S_i is an axiom. Then by Thm.VI.3.1, (x) S_i is also an axiom.

Case 2. S_i is the same as an earlier S_j. Then (x) S_i is the same as an
earlier (x) S_j.

Case 3. There are an earlier S_j and an earlier S_k such that S_k is S_j ⊃ S_i.
That is, S_j and S_j ⊃ S_i occur before S_i. Then the key steps (x) S_j and
(x).S_j ⊃ S_i have already occurred. Insert the steps

(x).S_j ⊃ S_i: ⊃ :(x) S_j ⊃ (x) S_i
(x) S_j ⊃ (x) S_i.

The first is an instance of Axiom scheme 4. Using the first as major premise
and (x).S_j ⊃ S_i as minor premise, we infer the second by modus ponens.
Then we can use the second as major premise and (x) S_j as minor premise
to infer (x) S_i by modus ponens.
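Since the proof is constructive, the filling in of steps can be carried out mechanically. The following Python sketch follows the three cases above under stated assumptions: the tuple encodings ("imp", A, B) for A ⊃ B and ("all", x, A) for (x) A, the function name generalize, and the reason tags "axiom", "repeat", and "mp" are all hypothetical conventions introduced for the illustration. The sketch takes on trust that steps marked "axiom" really are axioms.

```python
def imp(a, b):
    """Assumed encoding of a ⊃ b."""
    return ("imp", a, b)

def forall(x, p):
    """Assumed encoding of (x) p."""
    return ("all", x, p)

def generalize(demo, x):
    """Turn a demonstration of ⊢ P into a list of steps ending in (x) P.

    demo is a list of (statement, reason) pairs, where reason is
      ("axiom",)      the step is an axiom (not verified here),
      ("repeat", j)   the step is the same as the earlier step j,
      ("mp", j, k)    step k is (step j) ⊃ (this step); j and k are earlier.
    Indices are 0-based. Returns a list of (statement, note) pairs.
    """
    out = []
    for i, (s, reason) in enumerate(demo):
        kind = reason[0]
        if kind == "axiom":
            # Case 1: (x) S_i is an axiom by Thm.VI.3.1.
            out.append((forall(x, s), "axiom, by Thm.VI.3.1"))
        elif kind == "repeat":
            # Case 2: (x) S_i is the same as an earlier key step.
            j = reason[1]
            assert j < i and demo[j][0] == s
            out.append((forall(x, s), "same as an earlier key step"))
        elif kind == "mp":
            # Case 3: insert an instance of Axiom scheme 4, then use
            # modus ponens twice.
            j, k = reason[1], reason[2]
            sj = demo[j][0]
            assert j < i and k < i and demo[k][0] == imp(sj, s)
            second = imp(forall(x, sj), forall(x, s))
            first = imp(forall(x, imp(sj, s)), second)
            out.append((first, "Axiom scheme 4"))
            out.append((second, "modus ponens"))
            out.append((forall(x, s), "modus ponens (key step)"))
        else:
            raise ValueError("unknown reason: %r" % (reason,))
    return out

# A three-step demonstration: A and A ⊃ B are taken to be axioms
# (an assumption made for the example), and B follows by modus ponens.
demo = [("A", ("axiom",)),
        (imp("A", "B"), ("axiom",)),
        ("B", ("mp", 0, 1))]
steps = generalize(demo, "x")
```

For this input the result has five steps: the two generalized axioms, the instance of Axiom scheme 4, the intermediate implication, and finally (x) B.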

This theorem states the generalization principle that, if one has proved a
statement F(x) containing the unknown x, one may generalize the unknown
x and infer F(x) with a variable x, that is, (x) F(x). Incidentally, it is
interesting to contrast the justification which we gave early in this chapter
of the generalization principle with the proof just given of Thm.VI.4.1.
Our earlier justification depended on meaning and was carried out in the
everyday intuitive logic of mathematics. The proof of Thm.VI.4.1, on the
other hand, is carried out in the constructive intuitive logic and depends
entirely on the forms of the statements involved.

The  generalization  principle  is  so  important  that  we  wish  to  cite  further 
instances  of  it.  First,  though,  we  wish  to  cite  two  results  which  look  rather 
like  instances  of  the  generalization  principle,  but  which  are  not,  namely: 

If  a  continuous  function  is  not  zero  at  some  point,  x,  then  for  all  points 
in  some  neighborhood  of  x,  the  function  is  not  zero. 

If  three  parallel  lines  cut  equal  segments  from  one  transversal,  they  cut 
equal  segments  from  all  transversals. 

The  reader  should  study  these  until  he  is  sure  he  understands  why  they 
are  not  instances  of  the  generalization  principle. 

We  turn  now  to  some  genuine  instances  of  the  generalization  principle. 
An  important  class  of  theorems  whose  proofs  make  use  of  the  generalization 
principle  are  the  locus  theorems  of  geometry.  Thus  let  us  look  at  the  locus 
theorem  which  we  quoted  on  pages  41  and  42  of  this  text.  Here  it  was 
assumed that YO is the perpendicular bisector of AB, and we wished to
prove that every P on YO is equidistant from A and B and every P off YO
is not equidistant from A and B. That is, we wished to prove:

(P):P on YO. ⊃ .PA = PB,
(P):~(P on YO). ⊃ .PA ≠ PB.

Suppose  we  look  carefully  at  the  proof  of  the  first  of  these  (the  proof  of 
the  second  involves  the  same  logical  principles) .  Having  assumed  YO  the 
perpendicular bisector of AB, Wentworth and Smith say:¹

"Let  P  be  any  point  in  YO,  ..." 

That is, they choose P an arbitrary, undetermined point of YO, which
must nevertheless temporarily stay fixed so that they can construct some
triangles and prove them congruent and infer PA = PB. That is,
Wentworth and Smith first prove

(VI.4.1) W, P on YO ⊢ PA = PB

where P is an unknown and W stands for the statement that YO is the
perpendicular bisector of AB. From this they proceed tacitly to

(VI.4.2) W ⊢ P on YO. ⊃ .PA = PB,

(VI.4.3) W ⊢ (P):P on YO. ⊃ .PA = PB,

(VI.4.4) ⊢ W ⊃ :(P):P on YO. ⊃ .PA = PB.

¹ From Wentworth and Smith, op. cit.

We can proceed from (VI.4.1) to (VI.4.2) and from (VI.4.3) to (VI.4.4)
by use of the deduction theorem. The step from (VI.4.2) to (VI.4.3) uses
the generalization principle. However, we cannot carry out this step by
use of Thm.VI.4.1, because Thm.VI.4.1 says merely that, if ⊢ V, then
⊢ (P) V, whereas to go from (VI.4.2) to (VI.4.3) we need to know that, if
W ⊢ V, then W ⊢ (P) V.

Actually,  the  step  from  (VI.4.2)  to  (VI.4.3)  violates  our  proviso  that  one 
can  apply  the  generalization  principle  only  to  proved  statements.  In 
(VI.4.2), the statement

P on YO. ⊃ .PA = PB

is  not  proved,  but  only  deduced  from  the  assumption  W. 

One is tempted to jump to the conclusion that one can apply the
generalization principle not merely to proved statements but also to statements
which have been correctly deduced from some assumption. However, this
is not so. If it were so, we could proceed as follows. Applying the
generalization principle to (VI.4.1) we get

(VI.4.5) W, P on YO ⊢ (P).PA = PB.

However, (Q).QA = QB is the same as (P).PA = PB. So we conclude

W, P on YO ⊢ (Q).QA = QB.

This  says  that,  if  YO  is  the  perpendicular  bisector  of  AB  and  P  is  on  YO, 
then  every  point  Q  is  equidistant  from  A  and  B. 

This  is  clearly  absurd. 

Thus we see that it is not permissible to apply the generalization principle
in (VI.4.1), but it is apparently permissible to apply the generalization
principle to (VI.4.2) to get (VI.4.3). To get an explanation of this
discrepancy, let us review the intuitive reasoning that one might use to
justify deriving (VI.4.3) from (VI.4.2). In (VI.4.2) we showed that for
some particular P we can deduce P on YO. ⊃ .PA = PB from W. Can
we conclude that this can be done for each P? Can we, for instance,
conclude that we can deduce Q on YO. ⊃ .QA = QB from W? The answer is
"Yes," for it suffices to repeat the proof of (VI.4.2) with Q in place of P.
Thus we can infer (P):P on YO. ⊃ .PA = PB from W.

If now we try to reason analogously to deduce (VI.4.5) from (VI.4.1) it
becomes immediately apparent why this deduction fails. If we replace
P by Q in the proof of (VI.4.1), we do not get W, P on YO ⊢ QA = QB; we
get instead W, Q on YO ⊢ QA = QB. In other words, the critical point
in going from (VI.4.2) to (VI.4.3) is that W does not depend on P, so that
if we replace P by Q in the proof of (VI.4.2) we get a proof of W ⊢ Q on
YO. ⊃ .QA = QB and so (VI.4.3) holds. Had W depended on P, then we
could not have applied the generalization principle to (VI.4.2) any more
than to (VI.4.1).

A  rather  obvious  way  of  putting  this  is  that,  if  any  of  our  assumptions 
depend  on  P,  as  "P  on  YO"  in  (VI.4.1),  then  they  put  a  restriction  on  P 
which  prevents  use  of  the  generalization  principle.  However,  if  none  of  our 
assumptions  depend  on  P,  as  W  does  not  depend  on  P  in  (VI.4.2),  then 
our  deduction  is  completely  unrestricted  as  far  as  P  is  concerned,  and  use 
of  the  generalization  principle  is  legitimate. 

Quite  clearly,  the  conditions  expressed  above  can  be  stated  in  a  manner 
which  depends  only  on  the  forms  of  the  statements  involved,  and  not  at  all 
on  their  meanings.    We  shall  do  so  in  the  following  theorem. 

**Theorem VI.4.2. If P_1, P_2, ..., P_n, Q are statements, not necessarily
distinct, and x is a variable which has no free occurrences in any of
P_1, P_2, ..., P_n, and if P_1, P_2, ..., P_n ⊢ Q, then P_1, P_2, ..., P_n ⊢ (x) Q.

Proof. Let S_1, S_2, ..., S_s be a demonstration of P_1, P_2, ..., P_n ⊢ Q.
Then S_s is Q, and for each S_i either:

(1) S_i is an axiom.

(2) S_i is a P.

(3) There is a j less than i such that S_i and S_j are the same.

(4) There are j and k, each less than i, such that S_k is S_j ⊃ S_i.

We take (x) S_1, (x) S_2, ..., (x) S_s to be key steps of our demonstration
and proceed to fill in additional steps as in the proof of Thm.VI.4.1 in the
cases (1), (3), and (4). In the case (2), where S_i is a P, we insert S_i and
S_i ⊃ (x) S_i as extra steps before the key step (x) S_i. Because S_i is a P,
it is acceptable as a step of our demonstration. Also, since there are no
free occurrences of x in any of the P's, there are no free occurrences of x in
S_i, and so S_i ⊃ (x) S_i is an instance of Axiom scheme 5. With S_i and
S_i ⊃ (x) S_i inserted, we get the key step (x) S_i by modus ponens.

This theorem enables us to use the generalization principle in symbolic
logic whenever it would be considered right to use it in everyday
mathematics, and only then. For this reason, we shall often refer to Thm.VI.4.2
as the generalization theorem.
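As an informal aside, the side condition of the generalization theorem depends only on the form of the assumptions, so it can be checked mechanically. The following Python sketch shows one way this might be done. The tuple encoding of statements and the names `free_vars` and `may_generalize` are assumptions made for illustration only; they are not part of the formal system of the text.

```python
# Hypothetical encoding of statements as nested tuples (not from the text):
#   ("atom", name, vars)  an atomic statement mentioning the variables in vars
#   ("not", F)            ~F
#   ("and", F, G)         FG
#   ("all", x, F)         (x) F

def free_vars(f):
    """Return the set of variables having a free occurrence in f."""
    tag = f[0]
    if tag == "atom":
        return set(f[2])
    if tag == "not":
        return free_vars(f[1])
    if tag == "and":
        return free_vars(f[1]) | free_vars(f[2])
    if tag == "all":
        # (x) binds x, so x is not free in (x) F
        return free_vars(f[2]) - {f[1]}
    raise ValueError(f"unknown formula tag: {tag!r}")

def may_generalize(assumptions, x):
    """Side condition of Thm.VI.4.2: x is free in none of the assumptions."""
    return all(x not in free_vars(p) for p in assumptions)

# W stands for the geometric assumptions; here it is taken to mention
# A, B, Y, O but not P (an assumption of this sketch).
W = ("atom", "W", ["A", "B", "Y", "O"])
P_on_YO = ("atom", "on", ["P", "Y", "O"])

print(may_generalize([W], "P"))           # True, the situation of (VI.4.2)
print(may_generalize([W, P_on_YO], "P"))  # False, the situation of (VI.4.1)
```

The check inspects only the assumptions to the left of ⊢, never the conclusion, which matches the statement of the theorem.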

The  generalization  principle  is  so  widely  used  in  everyday  mathematics 
that  it  is  usually  taken  for  granted  and  used  without  specific  mention. 
We  shall  cite  one  instance  of  its  use  which  is  out  of  the  ordinary  in  that 
there  is  explicit  mention  of  the  generalization  principle. 

On page 275, Wentworth and Smith state the definition:¹

"If a straight line drawn to a plane is perpendicular to every straight
line that passes through its foot and lies in the plane, it is said to be
perpendicular to the plane."

¹ Ibid.

On the next page, they state and prove:¹

"If a line is perpendicular to each of two other lines at their point of
intersection, it is perpendicular to the plane of the two lines."

We shall quote only the beginning of the proof and the step at which the
generalization principle is used. The proof begins:¹

"Given the line AO perpendicular to the lines OP and OR at O.

"To prove that AO is perpendicular to the plane MN of these lines.

"Proof. Through O draw in MN any other line OQ, . . . ."

Now various other lines are drawn, and triangles are proved congruent,
and finally it is shown that OQ is perpendicular to AO. Then comes the
use of the generalization principle:¹

"Therefore AO is perpendicular to any and hence to every line in MN
through O."

EXERCISES 

VI.4.1.  Find  an  illustration  in  the  mathematical  literature  of  the  use  of 
the  generalization  principle. 

VI.4.2. Write out a complete demonstration of ⊢ (x) ~(~PP).

VI.4.3. If there are no free occurrences of x in P, prove

⊢ (x).P ⊃ Q: ⊃ :P ⊃ (x) Q.

5. The Equivalence and Substitution Theorems. Although we can now
prove ⊢ P ≡ ~~P, we still have not proved that we can replace P by ~~P
or ~~P by P whenever desired. In ordinary mathematics, where one
goes by meaning, such a replacement would be immediately sanctioned,
since P and ~~P have the same meaning. More generally, any time we
have proved ⊢ A ≡ B, we should expect to be able to replace A by B or
vice versa. In this section, we shall prove the substitution theorem, which
says that we can indeed replace A by B if we have ⊢ A ≡ B.

We  cannot  prove  the  substitution  theorem  in  full  generality  at  the  present 
time.  Thus  we  shall  have  to  give  a  second  proof  at  a  later  time  when  full 
generality  is  attainable.  Nonetheless,  the  present  form  of  the  substitution 
theorem  will  be  of  sufficient  value  to  warrant  proving  it  now,  even  though 
this  will  involve  some  duplication  later  when  we  prove  the  more  general 
form. 

*Theorem VI.5.1. ⊢ (x) P ⊃ P.

Proof. (x) P ⊃ P is an axiom, namely, an instance of Axiom scheme 6.
To see this, let us refer back to the first, unsimplified version of Axiom
scheme 6. There let us take y to be the same as x. Then Q is the result of
replacing all free occurrences of x in P by occurrences of x, so that Q is P
and the replacement causes no confusion, and we can indeed apply Axiom
scheme 6.

Theorem VI.5.2. ⊢ (x).PQ: ⊃ :(x) P.(x) Q.

Proof. By Thm.IV.4.17, it suffices to prove each of

⊢ (x).PQ: ⊃ :(x) P,

⊢ (x).PQ: ⊃ :(x) Q.

The proofs of these are very similar, and so we give a proof for the second
only. By the deduction theorem (Thm.IV.6.1), it suffices to prove

(x).PQ ⊢ (x) Q,

and by the generalization theorem (Thm.VI.4.2), it suffices to prove

(x).PQ ⊢ Q

(since clearly there are no free occurrences of x in (x).PQ). To prove this
latter, we have

⊢ (x).PQ: ⊃ :PQ

by Thm.VI.5.1 and

⊢ PQ ⊃ Q

by Thm.IV.4.18.

An alternative proof would be as follows. By Thm.IV.4.18,

⊢ PQ ⊃ Q.

So by Thm.VI.4.1,

⊢ (x):PQ ⊃ Q.

But by Axiom scheme 4

⊢ (x):PQ ⊃ Q.: ⊃ :.(x).PQ: ⊃ :(x) Q.

So by modus ponens

⊢ (x).PQ: ⊃ :(x) Q.

Theorem VI.5.3. ⊢ (x).P ≡ Q: ⊃ :(x) P ≡ (x) Q.

Proof. By Thm.VI.5.2,

⊢ (x).P ≡ Q: ⊃ :(x).P ⊃ Q:(x).Q ⊃ P.

However, by Axiom scheme 4,

⊢ (x).P ⊃ Q: ⊃ :(x) P ⊃ (x) Q,
⊢ (x).Q ⊃ P: ⊃ :(x) Q ⊃ (x) P.

So by Thm.IV.4.16,

⊢ (x).P ⊃ Q:(x).Q ⊃ P: ⊃ :(x) P ≡ (x) Q.

We now prove the equivalence theorem.

In both the equivalence theorem and the substitution theorem there
occurs a complicated hypothesis. Rather than write it out twice in the
statements of the two theorems, we temporarily denote it by Hypothesis H_1.

Hypothesis H_1. P_1, P_2, ..., P_n, A, B are statements and x_1, x_2, ..., x_a
are variables. W is built up out of some or all of the P's and A and B by
means of &, ~, and (x), where each time (x) is used, x is one of x_1, x_2, ..., x_a,
and where one may use each P or each x or A or B more than once if desired.
V is the result of replacing some or none of the A's in W by B's.

The equivalence theorem then takes the form:

Theorem VI.5.4. Assume Hypothesis H_1. Let y_1, y_2, ..., y_b be variables
such that there are no free occurrences of any of the x's in
(y_1)(y_2) ... (y_b) (A ≡ B). Then

⊢ (y_1)(y_2) ... (y_b) (A ≡ B). ⊃ .W ≡ V.

Proof. Proof by induction on the number of symbols in W, counting as a
symbol each occurrence of a P, or of A or of B or of ~ or of a (x).
Temporarily let X denote (y_1)(y_2) ... (y_b) (A ≡ B).

Let W have one symbol. Then W is A, B, or a P.

Case 1. W is A.

Subcase I. A is replaced by B. Then V is B. So W ≡ V is A ≡ B.
By b uses of Thm.VI.5.1, we get

⊢ X ⊃ .A ≡ B.

That is,

⊢ X ⊃ .W ≡ V.

Subcase II. A is not replaced by B. Then V is A and W ≡ V is A ≡ A.
By the truth-value theorem,

⊢ X ⊃ .A ≡ A.

That is,

⊢ X ⊃ .W ≡ V.

Case 2. W is B or a P. Then V is the same and W ≡ V is W ≡ W.
By the truth-value theorem,

⊢ X ⊃ .W ≡ W.

That is,

⊢ X ⊃ .W ≡ V.

Assume the theorem if W has k or fewer symbols, where k is a positive
integer, and let W have k + 1 symbols. Then W has at least two symbols,
and so must be either ~Q or QR or (x) Q, where Q (and R) has k or fewer
symbols.

Case 1. W is ~Q. Then V is ~S, where S is the result of replacing
some or none of the A's in Q by B's. So by the hypothesis of the induction

⊢ X ⊃ .Q ≡ S.

By the truth-value theorem,

⊢ Q ≡ S. ⊃ .~Q ≡ ~S.

So

⊢ X ⊃ .~Q ≡ ~S.

That is,

⊢ X ⊃ .W ≡ V.

Case 2. W is QR. Then V is ST, where S and T are the results of
replacing some or none of the A's in Q and R by B's. So by the hypothesis of the
induction

⊢ X ⊃ .Q ≡ S,

⊢ X ⊃ .R ≡ T.

By Thm.IV.4.17,

⊢ X ⊃ .Q ≡ S.R ≡ T.

By the truth-value theorem

⊢ Q ≡ S.R ≡ T. ⊃ .QR ≡ ST.

So

⊢ X ⊃ .QR ≡ ST.

That is,

⊢ X ⊃ .W ≡ V.

Case 3. W is (x) Q. Then V is (x) S, where S is the result of replacing
some or none of the A's in Q by B's. So by the hypothesis of the induction

⊢ X ⊃ .Q ≡ S.

So

X ⊢ Q ≡ S.

By the generalization theorem

X ⊢ (x) (Q ≡ S),

since part of the hypothesis of our theorem is that there are no free
occurrences of x in X. Then by the deduction theorem

⊢ X ⊃ (x) (Q ≡ S).

Then by Thm.VI.5.3,

⊢ X ⊃ .(x) Q ≡ (x) S.

That is,

⊢ X ⊃ .W ≡ V.

There is a somewhat weaker version of the equivalence theorem which is
often useful, namely:

**Theorem VI.5.5. Assume Hypothesis H_1. If ⊢ A ≡ B, then ⊢ W ≡ V.

Proof. Clearly there are no free occurrences of the x's in
(x_1)(x_2) ... (x_a) (A ≡ B). So by the previous theorem

⊢ (x_1)(x_2) ... (x_a) (A ≡ B). ⊃ .W ≡ V.

If now we have ⊢ A ≡ B, then by a uses of Thm.VI.4.1, we get

⊢ (x_1)(x_2) ... (x_a) (A ≡ B).

Then by modus ponens

⊢ W ≡ V.

We now prove the substitution theorem.

**Theorem VI.5.6. Assume Hypothesis H_1. If ⊢ A ≡ B and ⊢ W, then
⊢ V.

Proof. If ⊢ A ≡ B, then ⊢ W ≡ V by the previous theorem. However,

⊢ W ≡ V. ⊃ .W ⊃ V

by Axiom scheme 2, and so ⊢ W ⊃ V by modus ponens. If in addition,
⊢ W, then ⊢ V by modus ponens.

Although the substitution theorem is an easy deduction from the
equivalence theorem, the two theorems differ widely in their applications. If we
have ⊢ A ≡ B, the equivalence theorem allows us to infer various
equivalences of the form ⊢ W ≡ V, whereas the substitution theorem allows us to
substitute occurrences of B for occurrences of A in proved theorems. That
is, given the proved theorem ⊢ W, we can substitute occurrences of B for
occurrences of A in it and get ⊢ V.
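For statements built without quantifiers, the effect of the two theorems can be observed directly with truth tables. The sketch below is only an illustration under that restriction: the tuple encoding and the helper names `replace`, `letters`, `value`, and `equivalent` are assumptions of this example, and a truth-table check is a semantic test, not a demonstration in the formal system. Note also that `replace` substitutes every occurrence of A, whereas Hypothesis H_1 allows replacing some or none.

```python
from itertools import product

# Hypothetical encoding: a statement letter is a string,
# ("not", F) stands for ~F and ("and", F, G) for FG.

def replace(w, a, b):
    """Replace every occurrence of the part a in w by b."""
    if w == a:
        return b
    if isinstance(w, tuple):
        return (w[0],) + tuple(replace(part, a, b) for part in w[1:])
    return w

def letters(f):
    """Collect the statement letters occurring in f."""
    if isinstance(f, str):
        return {f}
    return set().union(*(letters(part) for part in f[1:]))

def value(f, env):
    """Truth value of f under the assignment env."""
    if isinstance(f, str):
        return env[f]
    if f[0] == "not":
        return not value(f[1], env)
    if f[0] == "and":
        return value(f[1], env) and value(f[2], env)
    raise ValueError(f"unknown connective: {f[0]!r}")

def equivalent(f, g):
    """True if f and g agree on every row of the truth table."""
    names = sorted(letters(f) | letters(g))
    rows = product([False, True], repeat=len(names))
    return all(value(f, dict(zip(names, r))) == value(g, dict(zip(names, r)))
               for r in rows)

A = "P"
B = ("not", ("not", "P"))                # ~~P
W = ("not", ("and", "P", ("not", "Q")))  # ~(P~Q)
V = replace(W, A, B)                     # ~(~~P~Q)

print(equivalent(A, B))  # True
print(equivalent(W, V))  # True
```

Since A and B agree on every row, so do W and V, which is the truth-table counterpart of ⊢ W ≡ V.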

There is little use of the equivalence theorems or the substitution theorem
in everyday mathematics. This is because very commonly, if one has
⊢ A ≡ B, it is considered that A and B have the same meaning, and so if
we replace A by B, we make no change in the meaning, and so there is no
need to justify the replacement. In symbolic logic, where the form alone is
considered, the substitution theorem is quite important. However,
occasionally in fairly complex situations, where the meaning is not easy to
follow, the substitution theorem may be used in ordinary mathematics.
Thus, in Landau, 1930, page 44, it is shown that the condition stated in
Satz 120 is equivalent to the second condition in the definition of a
"Schnitt", and it is then concluded that the former may replace the latter
in the definition of a "Schnitt". This is a clear-cut instance of the use of
the substitution theorem.

EXERCISES

VI.5.1. Using ⊢ P ≡ ~~P, prove ⊢ Q ⊃ (x) P. ≡ .~(Q(Ex)~P).

VI.5.2. Without using the substitution theorem or either version of the
equivalence theorem, prove ⊢ Q ⊃ (x) P. ≡ .~(Q(Ex)~P).

VI.5.3. Prove ⊢ (x).P ≡ Q: ⊃ :(x).P ⊃ R. ≡ .(x).Q ⊃ R.

VI.5.4. If there are no free occurrences of x in P, prove
⊢ (x).P ≡ Q: ⊃ :P ≡ (x) Q.

VI.5.5. If there are no free occurrences of x in Q, prove
⊢ (x).P ≡ Q: ⊃ :P ≡ (x) Q.

6. Useful Equivalences. The substitution theorem gives us a useful
application for equivalences. Accordingly, we collect a large number of
equivalences in the present section. We remark that, from a given set of
equivalences, one can derive new ones by use of the equivalence theorems
of the preceding section. Also, because of

⊢ P ≡ Q. ⊃ .Q ≡ P,

which follows from the truth-value theorem, the order of an equivalence is
reversible. One should not forget that equivalences can be combined by
use of

⊢ P ≡ Q.Q ≡ R. ⊃ .P ≡ R.

One can prove a large number of equivalences by means of the truth-value
theorem. We have collected a group of such equivalences in the next
theorem.

Theorem VI.6.1.

I. ⊢ P ≡ PP.
**II. ⊢ P ≡ ~~P.
III. ⊢ P ≡ PvP.
IV. ⊢ P ≡ .P.Qv~Q.
V. ⊢ P ≡ PvQ~Q.
VI. ⊢ P ≡ PQvP~Q.
VII. ⊢ P ≡ .PvQ.Pv~Q.
VIII. ⊢ P ≡ .P.PvQ.
IX. ⊢ P ≡ PvPQ.
X. ⊢ P ≡ .PvQ.Q ⊃ P.
XI. ⊢ P ≡ :Q ≡ .P ≡ Q.
XII. ⊢ P ≡ .P ≡ Qv~Q.
XIII. ⊢ ~P ≡ .P ≡ Q~Q.
**XIV. ⊢ PQ ≡ QP.
**XV. ⊢ PvQ ≡ QvP.
**XVI. ⊢ (PQ)R ≡ P(QR).
**XVII. ⊢ (PvQ)vR ≡ Pv(QvR).
**XVIII. ⊢ P.QvR. ≡ .PQvPR.
**XIX. ⊢ PvQR. ≡ .PvQ.PvR.
XX. ⊢ P ⊃ R.Q ⊃ R. ≡ .PvQ ⊃ R.
XXI. ⊢ P ⊃ Q.P ⊃ R. ≡ .P ⊃ QR.
XXII. ⊢ Pv.Q ⊃ R: ≡ :PvQ. ⊃ .PvR.
XXIII. ⊢ Pv.Q ≡ R: ≡ :PvQ. ≡ .PvR.
XXIV. ⊢ P ⊃ QvR. ≡ .P ⊃ Q.v.P ⊃ R.
XXV. ⊢ PQ ⊃ R. ≡ .P ⊃ R.v.Q ⊃ R.
XXVI. ⊢ P ⊃ .Q ⊃ R: ≡ :P ⊃ Q. ⊃ .P ⊃ R.
XXVII. ⊢ P ⊃ .Q ≡ R: ≡ :P ⊃ Q. ≡ .P ⊃ R.
XXVIII. ⊢ PvQ ≡ ~(~P~Q).
XXIX. ⊢ PQ ≡ ~(~Pv~Q).
XXX. ⊢ ~(PvQ) ≡ ~P~Q.
XXXI. ⊢ ~(PQ) ≡ ~Pv~Q.
XXXII. ⊢ PQ. ≡ .Pv~Q.Q.
XXXIII. ⊢ PvQ ≡ P~QvQ.
XXXIV. ⊢ PQ. ≡ .P.P ⊃ Q.
XXXV. ⊢ PvQ: ≡ :P ⊃ Q. ⊃ .Q.
XXXVI. ⊢ PQ. ≡ .PvQ.Pv~Q.~PvQ.
XXXVII. ⊢ PvQ ≡ PQvP~Qv~PQ.
XXXVIII. ⊢ PQ. ≡ .PvQ.P ≡ Q.
XXXIX. ⊢ PvQ. ≡ .PQv.P ≡ ~Q.
XL. ⊢ ~(PQ) ≡ P~Qv~PQv~P~Q.
XLI. ⊢ ~(PvQ). ≡ .Pv~Q.~PvQ.~Pv~Q.
XLII. ⊢ P ⊃ Q. ≡ .~(P~Q).
**XLIII. ⊢ P ⊃ Q. ≡ .~PvQ.
**XLIV. ⊢ P ⊃ Q. ≡ .~Q ⊃ ~P.
XLV. ⊢ P ⊃ ~Q. ≡ .Q ⊃ ~P.
XLVI. ⊢ ~P ⊃ Q. ≡ .~Q ⊃ P.
XLVII. ⊢ P ⊃ Q. ≡ .P ⊃ PQ.
XLVIII. ⊢ P ⊃ Q. ≡ .PvQ ⊃ Q.
XLIX. ⊢ P ⊃ Q. ≡ .P ≡ PQ.
L. ⊢ P ⊃ Q. ≡ .PvQ ≡ Q.
LI. ⊢ P ⊃ Q. ≡ .PQv~PQv~P~Q.
LII. ⊢ PQ: ≡ :P ≡ .P ⊃ Q.
LIII. ⊢ PvQ: ≡ :Q ≡ .P ⊃ Q.
LIV. ⊢ P ≡ Q. ≡ .P ⊃ Q.Q ⊃ P.
LV. ⊢ P ≡ Q. ≡ .P ⊃ Q.~P ⊃ ~Q.
LVI. ⊢ P ≡ Q. ≡ .Pv~Q.~PvQ.
LVII. ⊢ P ≡ Q. ≡ .~P ≡ ~Q.
LVIII. ⊢ P ≡ ~Q. ≡ .~P ≡ Q.
LIX. ⊢ P ≡ Q. ≡ .PQv~P~Q.
LXI. ⊢ ~(P ≡ Q) ≡ P~Qv~PQ.
LXII. ⊢ ~(P ≡ Q). ≡ .PvQ.~Pv~Q.
**LXIII. ⊢ PQ ⊃ R: ≡ :P ⊃ .Q ⊃ R.
**LXIV. ⊢ P ⊃ .Q ⊃ R: ≡ :Q ⊃ .P ⊃ R.
LXV. ⊢ P~P ≡ Q~Q.
LXVI. ⊢ Pv~P ≡ Qv~Q.
LXVII. ⊢ P ⊃ .Q ≡ PQ.
LXVIII. ⊢ P ⊃ .P ≡ PvQ.
LXIX. ⊢ P ⊃ :Q ≡ .P ≡ Q.
LXX. ⊢ P ⊃ :P ≡ .Q ⊃ P.
LXXI. ⊢ P ⊃ :Q ⊃ .P ≡ Q.
LXXII. ⊢ ~P ⊃ .P ≡ PQ.
LXXIII. ⊢ ~P ⊃ .Q ≡ PvQ.
LXXIV. ⊢ ~P ⊃ :~Q ⊃ .P ≡ Q.
LXXV. ⊢ ~P ⊃ :~Q ≡ .P ≡ Q.

We note one useful consequence of the substitution theorem in connection
with the above list of equivalences. Since we have the commutative
and associative laws for the logical product (Parts XIV and XVI), we can
prove any two products of several factors equivalent regardless of their
order and the method of inserting parentheses. Hence any such product
may be substituted for any other such in any statement. Hence we no
longer need distinguish such products, but may write P_1P_2 ... P_n as
standing indifferently for any product of the factors P_1, P_2, ..., P_n
regardless of order and grouping.

Similar  remarks  apply  to  the  logical  sum. 

Although we have not listed above all equivalences provable by truth
values that have ever been used, we have listed all that are in common use,
and many less common ones. We have even listed several variations of
some of the more useful ones. Thus XLV and XLVI are variations of the
very important XLIV. On the other hand, we have not listed

⊢ PQ. ≡ .P.~PvQ

and

⊢ PvQ ≡ Pv~PQ

which are variations of XXXII and XXXIII got by interchanging the
roles of P and Q.

We call attention to LXXI. From this, we see that, if ⊢ P and ⊢ Q, then
⊢ P ≡ Q, and we may substitute Q for P or vice versa. In effect, any two
true statements are equivalent and one may be substituted for the other in
any statement. Correspondingly, by LXXIV, if ⊢ ~P and ⊢ ~Q, then
⊢ P ≡ Q, and we may substitute Q for P or vice versa. In effect, any two
false statements are equivalent, and one may be substituted for the other
in any statement.
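The truth-value computations behind such parts of Thm.VI.6.1 are easy to mechanize. As an informal illustration, the following Python sketch checks Parts LXXI and LXXIV, together with the commutative and associative laws (Parts XIV and XVI), by running through every row of the truth table. The helper names `implies` and `is_tautology` are assumptions of this example, and Python's `==` on truth values is used to stand for ≡.

```python
from itertools import product

def implies(p, q):
    """Truth function of P ⊃ Q."""
    return (not p) or q

def is_tautology(f, n):
    """True if the n-argument truth function f is true on every row."""
    return all(f(*row) for row in product([False, True], repeat=n))

# Part LXXI:  P ⊃ :Q ⊃ .P ≡ Q
lxxi = lambda p, q: implies(p, implies(q, p == q))
# Part LXXIV: ~P ⊃ :~Q ⊃ .P ≡ Q
lxxiv = lambda p, q: implies(not p, implies(not q, p == q))
# Part XIV:   PQ ≡ QP
xiv = lambda p, q: (p and q) == (q and p)
# Part XVI:   (PQ)R ≡ P(QR)
xvi = lambda p, q, r: ((p and q) and r) == (p and (q and r))

print(is_tautology(lxxi, 2), is_tautology(lxxiv, 2))  # True True
print(is_tautology(xiv, 2), is_tautology(xvi, 3))     # True True
print(is_tautology(lambda p, q: p == q, 2))           # False
```

The last line is a reminder that P ≡ Q by itself is not a tautology; LXXI and LXXIV yield it only when both statements are proved, or both refuted.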

There  are  many  useful  equivalences  involving  quantifiers.  We  now  state 
and  prove  some  of  these. 

Theorem VI.6.2.

I. ⊢ (Ex) P ≡ ~(x) ~P.

II. ⊢ (x) P ≡ ~(Ex) ~P.

III. ⊢ ~(Ex) P ≡ (x) ~P.

IV. ⊢ ~(x) P ≡ (Ex) ~P.

Proof. If we write the left side of Part I in unabbreviated form, we see
that it is identical with the right side, so that Part I is an instance of
⊢ Q ≡ Q, and follows from the truth-value theorem. As Part I is proved for
every P and x, it will remain true if we replace P by ~P, so that we get

⊢ (Ex) ~P ≡ ~(x) ~~P.

By Thm.VI.6.1, Part II, we are entitled to replace the ~~P by P,
which gives us

⊢ (Ex) ~P ≡ ~(x) P,

and we readily infer Part IV. Now by Thm.VI.6.1, Part LVIII, we get
Part III from Part I, and Part II from Part IV.

We now wish to introduce a powerful means of proving equivalences,
known as the duality theorem. The statement of the theorem involves the
notion of the dual of a statement. To define this notion, we must temporarily
not think of PvQ and (Ex) P being defined as ~(~P~Q) and ~(x) ~P,
but must temporarily think of v and (Ex) as basic operators on the same
footing with ~, &, and (x). Now let P_1, P_2, ..., P_n be statements, and
let W be built up out of some or all of the P's by use of ~, &, v, (x), and
(Ex), where we may use each P as often as desired, and may use whatever
variables we like in the (x) and (Ex), and as often as desired. Then the
dual of W is got from W by replacing products by sums, sums by products,
(x) by (Ex), (Ex) by (x), and each P_i by ~P_i. In case any occurrence of a
P_i already has a ~ attached in front of it in W, we shall get a ~~P_i at
that place upon forming the dual, and since we can replace this by P_i, we
might as well do so and count this replacement as part of the operation of
forming the dual. Accordingly, we combine these two operations into a
single operation which we call the operation of forming the dual. We
define the combined operation as follows.

The dual of W is that statement which one gets from W by simultaneously
performing the following changes:

1. If an occurrence of P_i does not have a ~ attached, attach one.

2. If an occurrence of P_i does have a ~ attached, take it off.

3. Replace each & by v and vice versa.

4. Replace each (x) by (Ex) and vice versa.

Clearly changes 1 and 2 are intended to produce the same result as
replacing each P_i by ~P_i, and then removing each ~~ produced by this.
In other words, changes 1 and 2 refer only to ~'s attached directly to P's,
and if a ~ is attached to a more complex part, this is not to be tampered
with.

To illustrate the idea of a dual, we shall list several statements in a
column below, with their duals in a column to the right.

P                        ~P
~P                       P
(Ex) P~Q                 (x).~PvQ
~(x) (Ey).Pv~QvRS        ~(Ex) (y).~PQ(~Rv~S)
~(~P(x) Q)               ~(Pv(Ex) ~Q)

Although  we  took  the  statements  on  the  right  to  be  the  duals  of  the 
statements  on  the  left,  we  see  that  the  statements  on  the  left  are  duals  of 
the  statements  on  the  right.  Indeed,  inspection  of  our  definition  of  a  dual 
makes  it  clear  that,  if  we  take  the  dual  of  a  dual,  we  return  to  the  original 
statement. 

Note that the definition of a dual is relative to the P's. That is, in
forming a dual, we operate entirely outside the P's, and if the P's
themselves have structure, we make no changes within the P's.

One must watch the omission of parentheses in forming duals. Thus the
dual of PQvR is (~Pv~Q)~R rather than ~Pv~Q~R. This is because
PQvR really means (PQ)vR whereas ~Pv~Q~R means ~Pv(~Q~R).

We remark again that, for the purposes of taking duals, v is counted as a
basic symbol. Thus, although PvQ and ~(~P~Q) are really the same,
for the purpose of taking duals we count them as different; indeed they have
different duals, to wit, ~P~Q and ~(PvQ), respectively. However,
though these duals are different, they are equivalent. Indeed, it will turn
out to be the case that equivalent statements have equivalent duals.

If W contains parts of the form P ⊃ Q or P ≡ Q, these must be expressed
in terms of ~, &, and v before forming the dual. One could express P ⊃ Q
as ~(P~Q), but a neater dual will result if we express P ⊃ Q as ~PvQ.
Similarly, it is best to express P ≡ Q as ~PvQ.Pv~Q.

We shall write W* for the dual of W. Clearly, if W* and V* are the duals
of W and V, then W*vV* is the dual of WV, W*V* is the dual of WvV,
(Ex) W* is the dual of (x) W, (x) W* is the dual of (Ex) W, and ~W* is the
dual of ~W except when W consists of a single P_i. In the latter case, the
dual of ~W is the dual of ~P_i and is P_i, whereas ~W* is ~~P_i.

We  now  state  the  duality  theorem. 

**Theorem VI.6.3. If W* is the dual of W, then ⊢ ~W ≡ W*.

Proof. Proof by induction on the number of symbols in W, counting as a
symbol each occurrence of a P, ~, (x), or (Ex). First let W consist of a
single symbol. Then W is P_i, W* is ~P_i, and so

⊢ ~W ≡ W*.

Now assume the theorem true if W has k or fewer symbols where k is a
positive integer and let W have k + 1 symbols. Then W has at least two
symbols. By our assumption on the structure of W, W must have one of
the forms ~Q, QR, QvR, (x) Q, or (Ex) Q, where Q and R have k or fewer
symbols.

Case 1. W is ~Q.

Subcase I. Q is a single symbol, P_i. Then W* is P_i and ~W is ~~P_i.
But we have ⊢ ~~P_i ≡ P_i. That is,

⊢ ~W ≡ W*.

Subcase II. Q has more than one symbol. Then W* is ~Q*. By the
hypothesis of the induction, ⊢ ~Q ≡ Q*. So by Thm.VI.6.1, Part LVII,
⊢ ~~Q ≡ ~Q*. That is,

⊢ ~W ≡ W*.

Case 2. W is QR. Then W* is Q*vR*. Moreover, we have ⊢ ~Q ≡ Q*
and ⊢ ~R ≡ R*. Since ⊢ ~(QR) ≡ ~Qv~R by Thm.VI.6.1, Part XXXI,
we get ⊢ ~(QR) ≡ Q*vR* by two uses of the substitution theorem. That is,

⊢ ~W ≡ W*.

Case 3. W is QvR. Similar to Case 2, except for using Part XXX of
Thm.VI.6.1.

Case 4. W is (x) Q. Then W* is (Ex) Q*. Moreover, we have ⊢ ~Q ≡ Q*.
By Thm.VI.6.2, Part IV, ⊢ ~(x) Q ≡ (Ex) ~Q. By the substitution
theorem, ⊢ ~(x) Q ≡ (Ex) Q*. That is,

⊢ ~W ≡ W*.

Case 5. W is (Ex) Q. Similar to Case 4, except for using Part III of
Thm.VI.6.2.

We cite below numerous instances of the duality theorem which have
already occurred, together with the W and W* involved in each case.

Instance                      W            W*

Thm.VI.6.1, Part II           ~P           P
Thm.VI.6.1, Part XXVIII       ~P~Q         PvQ
Thm.VI.6.1, Part XXIX         ~Pv~Q        PQ
Thm.VI.6.1, Part XXX          PvQ          ~P~Q
Thm.VI.6.1, Part XXXI         PQ           ~Pv~Q
Thm.VI.6.1, Part XLII         P~Q          P ⊃ Q
Thm.VI.6.1, Part LXI          P ≡ Q        P~Qv~PQ
Thm.VI.6.2, Part I            (x) ~P       (Ex) P
Thm.VI.6.2, Part II           (Ex) ~P      (x) P
Thm.VI.6.2, Part III          (Ex) P       (x) ~P
Thm.VI.6.2, Part IV           (x) P        (Ex) ~P

We note that from the duality theorem it follows that, if two statements
are equivalent, then their duals are likewise equivalent. For if ⊢ W ≡ V,
then ⊢ ~W ≡ ~V, and this with ⊢ ~W ≡ W* and ⊢ ~V ≡ V* gives
⊢ W* ≡ V*.

A rather more startling and important consequence of the duality
theorem is embodied in the following theorem, which we shall call the
corollary to the duality theorem.

Theorem VI.6.4. Let P_1, P_2, ..., P_n be statements, and let W and V
be built up out of some or all of the P's by use of ~, &, v, (x), and (Ex),
where we may use each P as often as desired, and may use whatever
variables we like in the (x) and (Ex), and as often as desired. Let X and Y be
the results of replacing & by v, v by &, (x) by (Ex), and (Ex) by (x) in
W and V, respectively. If ⊢ W ≡ V, and if this would continue to hold
if we replace each P_i by ~P_i, then ⊢ X ≡ Y.

Proof. Since ⊢ W ≡ V continues to hold if we replace each P_i by ~P_i,
let us make this replacement. Call the result ⊢ W_1 ≡ V_1. Then
⊢ ~W_1 ≡ ~V_1. However, ⊢ ~W_1 ≡ W_1* and ⊢ ~V_1 ≡ V_1*. So
⊢ W_1* ≡ V_1*. However, clearly W_1* is X, and V_1* is Y, and our theorem
is proved.

In most of the cases encountered so far in which ⊢ W ≡ V has been
proved, it holds no matter what we substitute for the P_i, and therefore
certainly in case we merely put ~P_i for each P_i. In such a general case,
⊢ X ≡ Y likewise holds no matter what we substitute for the P_i, as one
can easily see by going through the proof of Thm.VI.6.4. Notice that the
same changes which we applied to W and V to get X and Y will, if applied
in turn to X and Y, give us W and V back again. So the relation between
⊢ W ≡ V and ⊢ X ≡ Y is a reciprocal one. If either holds regardless of
what we substitute for the P_i, then we can deduce the other by the theorem
just proved.

We have already given many pairs of equivalences related as W ≡ V and
X ≡ Y are related. For instance, among the parts of Thm.VI.6.1 occur
the pairs I and III, IV and V, VI and VII, VIII and IX, XIV and XV,
XVI and XVII, XVIII and XIX, XXVIII and XXIX, XXX and XXXI,
XXXII and XXXIII, XXXVI and XXXVII, and XL and XLI. Also
in Thm.VI.6.2, Parts I and II are a pair and Parts III and IV are a pair.
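For the quantifier-free pairs, the passage from W ≡ V to X ≡ Y can be tried out mechanically. The following sketch is an informal illustration using the distributive pair XVIII and XIX: it interchanges & and v throughout both sides and confirms by truth table that each equivalence holds. The tuple encoding and the names `swap`, `value`, and `equivalent` are assumptions of this example, and the truth-table test stands in for the formal proof.

```python
from itertools import product

# Hypothetical encoding: letters are strings; ("and", F, G) is FG,
# ("or", F, G) is FvG, ("not", F) is ~F.

def swap(f):
    """Interchange & and v throughout f, leaving letters and ~ alone."""
    if isinstance(f, str):
        return f
    tag = {"and": "or", "or": "and"}.get(f[0], f[0])
    return (tag,) + tuple(swap(part) for part in f[1:])

def value(f, env):
    """Truth value of f under the assignment env."""
    if isinstance(f, str):
        return env[f]
    if f[0] == "not":
        return not value(f[1], env)
    if f[0] == "and":
        return value(f[1], env) and value(f[2], env)
    if f[0] == "or":
        return value(f[1], env) or value(f[2], env)
    raise ValueError(f"unknown connective: {f[0]!r}")

def equivalent(f, g, names=("P", "Q", "R")):
    """True if f and g agree on every assignment to the given letters."""
    return all(value(f, dict(zip(names, row))) == value(g, dict(zip(names, row)))
               for row in product([False, True], repeat=len(names)))

# Part XVIII: P.QvR. ≡ .PQvPR
W = ("and", "P", ("or", "Q", "R"))
V = ("or", ("and", "P", "Q"), ("and", "P", "R"))

# Interchanging & and v gives the two sides of Part XIX: PvQR. ≡ .PvQ.PvR
X, Y = swap(W), swap(V)

print(equivalent(W, V))  # True
print(equivalent(X, Y))  # True
```

Applying `swap` to X and Y returns W and V, which reflects the reciprocal relation between the two equivalences.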

Theorem VI.6.5.
*I. ⊢ (x).PQ: ≡ :(x) P.(x) Q.
*II. ⊢ (Ex).PvQ: ≡ :(Ex) Pv(Ex) Q.

Proof of Part I. Thm.VI.5.2 gives half of Part I. To get the other
half, it suffices to prove

(x) P.(x) Q ⊢ PQ,

since we can then apply successively the generalization theorem, and the
deduction theorem. If we start with (x) P.(x) Q, we get (x) P and (x) Q
by Thm.IV.4.18, then P and Q by Thm.VI.5.1, and finally PQ by
Thm.IV.4.22.

Part II follows from Part I by the corollary to the duality theorem.

Theorem VI.6.6. If there are no free occurrences of x in Q, then:
I. ⊢ (x) Q ≡ Q.
II. ⊢ (Ex) Q ≡ Q.
*III. ⊢ (x).PQ: ≡ :(x) P.Q.
*IV. ⊢ (Ex).PvQ: ≡ :(Ex) P.vQ.
*V. ⊢ (x).PvQ: ≡ :(x) P.vQ.
*VI. ⊢ (Ex).PQ: ≡ :(Ex) P.Q.
*VII. ⊢ (x).P ⊃ Q: ≡ :(Ex) P. ⊃ Q.
VIII. ⊢ (Ex).P ⊃ Q: ≡ :(x) P. ⊃ Q.
*IX. ⊢ (x).Q ⊃ P: ≡ :Q ⊃ (x) P.
X. ⊢ (Ex).Q ⊃ P: ≡ :Q ⊃ (Ex) P.

Proof of Part I. By Thm.VI.5.1, ⊢ (x) Q ⊃ Q, and by Axiom scheme 5,
⊢ Q ⊃ (x) Q.

Part  II  follows  from  Part  I  by  the  corollary  to  the  duality  theorem. 
Note  that  Part  I  does  not  hold  regardless  of  what  we  substitute  for  Q,  since 
it  would  not  hold  if  we  substitute  for  Q  a  statement  with  free  occurrences 
of  X.  However,  Part  I  will  still  hold  if  we  replace  Q  by  '^Q,  since  if  Q  has 
no  free  occurrences  of  x,  '^Q  likewise  has  no  free  occurrences  of  x.  Hence 
we  can  apply  Thm. VI. 6. 4. 
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Proof  of  Part  III.  By  Part  I  and  the  substitution  theorem,  we  can  re- 
place (x)  Q  by  Q  in  Part  I  of  Thm.VI.6.5. 

Part  IV  follows  from  Part  III  by  the  corollary  to  the  duality  theorem. 

Proof  of  Part  X.  Put  ~Q  for  Q  in  Part  IV  and  use  Thm.VI.6.1,  Part 
XLIII. 

Proof of Part IX. By Axiom scheme 4, ⊢ (x).Q ⊃ P: ⊃ :(x) Q. ⊃ .(x) P.
So by Part I and the substitution theorem, we get

⊢ (x).Q ⊃ P: ⊃ :Q ⊃ (x) P.

To get the converse, note that, by Thm.VI.5.1,

Q ⊃ (x) P ⊢ Q ⊃ P.

So, by the generalization theorem and the deduction theorem,

⊢ Q ⊃ (x) P: ⊃ :(x).Q ⊃ P.

Proof  of  Part  VIII.  Put  ~P  for  P  in  Part  IV  and  use  Thm.VI.6.1, 
Part  XLIII  and  Thm.VI.6.2,  Part  IV. 

Proof  of  Part  VII.  Put  ~P  and  ~Q  for  P  and  Q  in  Part  IX  and  use 
Thm.VI.6.1,  Parts  XLIV  and  XLVI. 

Proof of Part V. Replace Q by ~Q in Part IX and use Thm.VI.6.1,
Parts XLIII and II.

Part  VI  follows  from  Part  V  by  the  corollary  to  the  duality  theorem. 

The theorem which we have just proved tells us what we can do with a
quantified statement when part of the statement does not involve the
variable of quantification.
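
As a semantic sanity check on Theorem VI.6.6 (not a demonstration in the
formal system), the following sketch interprets P as an arbitrary predicate
on a small finite domain and Q as a truth value that does not depend on x,
and confirms Parts III, V, VII, VIII, and IX by exhaustive enumeration. The
three-element domain and the helper names are assumptions made purely for
illustration.

```python
from itertools import product

DOMAIN = range(3)  # assumed small, non-empty domain of interpretation


def implies(a, b):
    """Truth table of the horseshoe: a ⊃ b."""
    return (not a) or b


def check_theorem_vi_6_6():
    """Check several parts of Thm.VI.6.6 for every predicate P on DOMAIN."""
    for p_values in product([False, True], repeat=len(DOMAIN)):
        for Q in (False, True):  # Q has no free x, so it is a constant
            forall_P = all(p_values[x] for x in DOMAIN)
            exists_P = any(p_values[x] for x in DOMAIN)
            # Part III: (x).PQ  ≡  (x) P.Q
            assert all(p_values[x] and Q for x in DOMAIN) == (forall_P and Q)
            # Part V: (x).P∨Q  ≡  (x) P.∨Q
            assert all(p_values[x] or Q for x in DOMAIN) == (forall_P or Q)
            # Part VII: (x).P ⊃ Q  ≡  (Ex) P. ⊃ Q
            assert all(implies(p_values[x], Q) for x in DOMAIN) == implies(exists_P, Q)
            # Part VIII: (Ex).P ⊃ Q  ≡  (x) P. ⊃ Q
            assert any(implies(p_values[x], Q) for x in DOMAIN) == implies(forall_P, Q)
            # Part IX: (x).Q ⊃ P  ≡  Q ⊃ (x) P
            assert all(implies(Q, p_values[x]) for x in DOMAIN) == implies(Q, forall_P)
    return True
```

Passing this check only shows that the equivalences are true in these finite
interpretations; the proofs in the text are what establish them as theorems.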

Theorem  VI.6.7. 
**I. ⊢ (x)(y) P ≡ (y)(x) P.
**II. ⊢ (Ex)(Ey) P ≡ (Ey)(Ex) P.

Proof of Part I. By two uses of Thm.VI.5.1,

(x)(y) P ⊢ P.

So, by two uses of the generalization theorem,

(x)(y) P ⊢ (y)(x) P.

Then, by the deduction theorem,

⊢ (x)(y) P ⊃ (y)(x) P.

In a similar manner, we prove

⊢ (y)(x) P ⊃ (x)(y) P.

Part II follows from Part I by the corollary to the duality theorem.

The parts of this theorem would ordinarily be abbreviated to ⊢ (x,y) P ≡
(y,x) P and ⊢ (Ex,y) P ≡ (Ey,x) P.

It is clear that the proof we gave for Part I would prove

⊢ (x,y,z) P ≡ (y,z,x) P

or  any  similar  generalization  of  Part  I.  From  such  a  generalization  of 
Part  I,  we  can  deduce  the  corresponding  generalization  of  Part  II.  We 
shall  feel  free  to  use  such  generalizations  in  the  future,  and  for  justification 
will  merely  refer  to  the  theorem  just  proved. 

Theorem  VI.6.8. 
**I. ⊢ (x) F(x) ≡ (y) F(y).
**II. ⊢ (Ex) F(x) ≡ (Ey) F(y).

By our convention with regard to the use of F(x) and F(y), this theorem
is to be considered as a shorthand way of stating the following result.

Theorem VI.6.8. Let x and y be variables and P and Q be statements.
Let Q be the result of replacing all free occurrences of x in P by occurrences
of y, and P be the result of replacing all free occurrences of y in Q by
occurrences of x. Then:
I. ⊢ (x) P ≡ (y) Q.
II. ⊢ (Ex) P ≡ (Ey) Q.

Proof of Part I. By Thm.VI.2.1, the replacement of x by y to go from
F(x) to F(y) causes no confusion, and so

⊢ (x) F(x) ⊃ F(y)

is an instance of Axiom scheme 6. So

(x) F(x) ⊢ F(y).

Clearly there are no free occurrences of y in (x) F(x), for if y is the same
variable as x, then the (x) in front binds all occurrences, whereas if y is a
variable different from x, then by our construction of F(x) from F(y) there
are no free occurrences of y in F(x) and hence none in (x) F(x). So we can
apply the generalization theorem to get

(x) F(x) ⊢ (y) F(y).

Then the deduction theorem gives

⊢ (x) F(x) ⊃ (y) F(y).

In a similar manner, we get

⊢ (y) F(y) ⊃ (x) F(x).

Part II follows from Part I by the corollary to the duality theorem, but
in order to be sure that we do not have any troubles about confusion of
variables, it is perhaps worth while to give a detailed proof. If F(x) and
F(y) satisfy the specified requirements on variables, so do ~F(x) and
~F(y). So by Part I

⊢ (x) ~F(x) ≡ (y) ~F(y).

Then

⊢ ~(x) ~F(x) ≡ ~(y) ~F(y)

by Thm.VI.6.1, Part LVII. This is Part II.

Because of this theorem, we can usually avoid having a statement in
which a given variable has both free and bound occurrences. Thus,
instead of F(x)∨(x) G(x) we can use the equivalent statement F(x)∨(y) G(y),
where we are careful to choose a y which does not occur in F(x).

The duality theorem is often very useful in connection with proofs by
reductio ad absurdum. We recall that the standard procedure for proving
P by reductio ad absurdum is to assume ~P and deduce Q~Q. Because
of the duality theorem we now have available three variants of this, namely:

"Assume P* and deduce Q~Q."

"Assume ~P and deduce QQ*."

"Assume P* and deduce QQ*."

If P is a direct statement (that is, P does not begin with a ~), then ~P
is a negative statement and so is awkward to deal with, whereas P* is a
direct statement and generally more agreeable to work with. Thus,
suppose P is the statement that a function is uniformly continuous in an
interval (a,b), which we earlier wrote out (see page 98). Then if we wish
to prove by reductio ad absurdum that P is true, it will be much more
effective to assume P* than ~P. We write P* below:

(Eε)::ε > 0:.(δ):.δ > 0. ⊃ :(Ex):a < x < b:(Ey).a < y < b.| y − x | < δ.
~(| f(y) − f(x) | < ε).

If we take | f(y) − f(x) | ≥ ε as equivalent to ~(| f(y) − f(x) | < ε),
and note that we can put a < x < b inside (Ey) since there are no free y's
in a < x < b (see Thm.VI.6.6, Part VI), we can write P* in the following
equivalent form:

(Eε):ε > 0:(δ):δ > 0. ⊃ .(Ex,y).a < x < b.a < y < b.| y − x | < δ.
| f(y) − f(x) | ≥ ε.

This is clearly a direct statement from which one could expect to prove
numerous consequences, and is quite a contrast to the purely negative
statement ~P.
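
To see why the direct form P* is agreeable to work with, one can exhibit
its witnesses explicitly. The sketch below does so for one function that is
not uniformly continuous. The function f(x) = 1/x on the interval (0,1), the
value ε = 1, and all names are our own assumptions for illustration; none of
them come from the text.

```python
def f(x):
    """Assumed example: f(x) = 1/x, considered on the interval (0, 1)."""
    return 1.0 / x


EPSILON = 1.0  # the ε whose existence P* asserts (our choice for this f)


def witnesses(delta):
    """Return (x, y) intended to witness (Ex,y) in P* for the given δ > 0."""
    t = min(delta, 0.5)
    return t, t / 2.0


def satisfies_dual(delta):
    """Check a < x < b, a < y < b, |y - x| < δ and |f(y) - f(x)| >= ε."""
    x, y = witnesses(delta)
    return (0.0 < x < 1.0 and 0.0 < y < 1.0
            and abs(y - x) < delta
            and abs(f(y) - f(x)) >= EPSILON)
```

For every positive δ the pair returned lies in (0,1), is closer together
than δ, and has function values differing by 1/t, which is at least 2. The
negative statement ~P gives no comparable recipe for what to compute.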

Although the duality theorem is not widely known among mathematicians,
an experienced analyst will have worked out enough special cases of
it so that derivation of the above statement as a way of saying that the
function f is not uniformly continuous in the interval (a,b) will not seem
startling. However, the novice finds such transformations fairly difficult
and would be helped by a knowledge of the duality theorem.

EXERCISES 
VI.6.1. Write the duals of:

(a) (x):.P ⊃ :(Ey):Q:(z).R ⊃ S.

(b) (Ex).P∨(y) Q:(u)(Ev) ~R.

(c) P ⊃ (x) Q∨(Ey) RS.

VI.6.2. Write out a complete demonstration of (x) F(x) ⊢ (y) F(y).

VI.6.3. Prove:

(a) ⊢ (x) P∨(x) Q: ⊃ :(x).P∨Q.

(b) ⊢ (Ex).PQ: ⊃ :(Ex) P.(Ex) Q.

VI.6.4. Find examples to show why one would not expect to be able to
prove either of:

(a) (x).P∨Q: ⊃ :(x) P∨(x) Q.

(b) (Ex) P.(Ex) Q: ⊃ :(Ex).PQ.

VI.6.5. Prove:

(a) ⊢ (x).P ⊃ Q: ⊃ :(Ex) P ⊃ (Ex) Q.

(b) ⊢ (x).P ≡ Q: ⊃ :(Ex) P ≡ (Ex) Q.

VI.6.6. Write out a complete demonstration of ⊢ (x)(y) P ⊃ (x) P.

VI.6.7. Write out a complete demonstration of (x)(y) P ⊢ (y)(x) P.

VI.6.8. Supply the conditions on free and bound variables needed to
make the following statement valid, and prove the resulting valid statement:

⊢ (Ex) F(x). ≡ .(Ex,y).F(x)∨F(y).

VI.6.9. Let P_1, . . . , P_n be statements (which may contain free
occurrences of various variables) and let W be built up out of the P's by means
of ~, &, and (x) for various x's. Prove that there are statements
Q_1, . . . , Q_m, which are the same as the P's or are got from them by replacing
various free occurrences of variables by occurrences of other variables, and
there is a V built from the Q's by means of ~ and & alone (not using any
quantifiers) and there is a string of quantifiers (Q) (some existential and
some universal) such that ⊢ W ≡ (Q) V. (Hint. Use induction on the
number of symbols in W.)

7. The Formal Analogue of an Act of Choice. The statement
P_1, P_2, . . . , P_n ⊢ Q means in effect that we can get from the P's and the
axioms to Q by successive uses of modus ponens. Now modus ponens is a
rule, "If P and P ⊃ Q, then Q," by which one can go from valid statements
to a new valid statement.

One might ask why we restrict ourselves to use of the single rule of modus
ponens. Actually, in effect we do not so restrict ourselves. Any time one
proves a theorem such as ⊢ ~~P ⊃ P or ⊢ P ⊃ Q. ⊃ .~Q ⊃ ~P, one
has in effect an additional rule such as "If ~~P, then P" or "If P ⊃ Q,
then ~Q ⊃ ~P." However, these additional rules merely involve the use
of modus ponens with a proved theorem of the form ⊢ W ⊃ V, and there
seems no point in stating such formal rules as long as we have modus ponens
and have stated the corresponding theorem. Even more complicated rules
can easily be got from proved theorems by modus ponens. Thus, from
Thm.IV.4.22, ⊢ P ⊃ (Q ⊃ PQ), by two uses of modus ponens, we infer the
rule "If P and Q, then PQ."

However, there are some useful rules which cannot be got by combining
modus ponens with a proved theorem of the form ⊢ W ⊃ V. Such a one is
justified by Thm.VI.4.1. The corresponding rule would be "If P, then
(x) P." We shall refer to it as the generalization rule, or more shortly as
"rule G." This cannot be derived by modus ponens from ⊢ P ⊃ (x) P,
because this statement is false so far as we know. Certainly it should be
false, for if it were true we could infer ⊢ (x).P ⊃ (x) P by Thm.VI.4.1, and
then get ⊢ (Ex) P ⊃ (x) P by Thm.VI.6.6, Part VII. As (Ex) P ⊃ (x) P
represents a false statement, ⊢ (Ex) P ⊃ (x) P certainly should be false.

The fact that we are using only the rule of modus ponens appears in our
definition of ⊢. If we admit the use of rule G as well as modus ponens, we
must change our definition of ⊢. Notice further that we cannot permit
unrestricted use of rule G. This is indicated in Thm.VI.4.2, where one is
permitted to use rule G in case x has no free occurrences in any of P_1, P_2,
. . . , P_n. With this restriction in mind, we state a definition for P_1, P_2, . . . ,
P_n ⊢_G Q, which will make it say in effect that we can get from the P's and
the axioms to Q by successive uses of the two rules, modus ponens and rule
G. Note that we reserve ⊢ for demonstrations which use only modus
ponens, and write ⊢_G whenever we permit uses of rule G.

The precise definition is as follows:

P_1, P_2, . . . , P_n ⊢_G Q indicates that there is a sequence of statements
S_1, S_2, . . . , S_s such that S_s is Q and for each S_i either:

(1) S_i is an axiom.

(2) S_i is a P.

(3) There is a j less than i such that S_i and S_j are the same.

(4) There are j and k, each less than i, such that S_k is S_j ⊃ S_i.

(5) There is a variable x, which does not occur free in any of P_1, P_2,
. . . , P_n, and a j less than i such that S_i is (x) S_j.

We call attention to the fact that (1) to (4) are exactly as in the definition
of ⊢. It is condition (4) that permits the use of modus ponens, and
condition (5) that permits the use of rule G. The sequence of S's is called a
demonstration of P_1, P_2, . . . , P_n ⊢_G Q, and the S's are called steps.

We characterize the various steps S_i according to which of the five cases
they come under. S's covered by (1) are axioms, S's covered by (2) are
P's, S's covered by (3) are repetitions, S's covered by (4) are results of
modus ponens, and S's covered by (5) are results of rule G. In general any
given step will come under only one case. However, one can conceive of
artificial situations in which a step could be justified by either of two cases.
In such a case, we shall take as the justification that case having the smaller
number. Having thus assured that there is a unique case assigned to each
step, we can now speak of the number of uses of modus ponens and the
number of uses of rule G. To be explicit, each S_i which comes under case
(4) constitutes a use of modus ponens, and each S_i which comes under
case (5) constitutes a use of rule G. We can further speak of the first,
second, third, etc., uses of modus ponens or rule G. Moreover, whenever
we use rule G to get a step S_i, the S_i has the form (x) S_j, and the use of
rule G consisted in attaching the (x) in front of the S_j. The x is said to be
the x involved in this use of rule G.
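
The five cases, together with the convention of taking the smallest
applicable case number, can be sketched as a small checker. This is only an
illustration under stated assumptions: statements are encoded as nested
tuples of our own devising, and because the axiom schemes are not restated
here, the caller supplies a predicate is_axiom that is simply trusted.

```python
def free_vars(f):
    """Free variables of a statement encoded as a nested tuple (our encoding)."""
    tag = f[0]
    if tag == "atom":   # ("atom", name, (variables...))
        return set(f[2])
    if tag == "not":    # ("not", A) stands for ~A
        return free_vars(f[1])
    if tag == "imp":    # ("imp", A, B) stands for A ⊃ B
        return free_vars(f[1]) | free_vars(f[2])
    if tag == "all":    # ("all", x, A) stands for (x) A
        return free_vars(f[2]) - {f[1]}
    raise ValueError(f"unknown statement tag: {tag!r}")


def justify(premises, steps, is_axiom):
    """Return the case number (1 to 5) for each step, or raise ValueError."""
    premise_free = set().union(*map(free_vars, premises))
    cases = []
    for i, s in enumerate(steps):
        earlier = steps[:i]
        if is_axiom(s):                                   # case (1)
            case = 1
        elif s in premises:                               # case (2)
            case = 2
        elif s in earlier:                                # case (3)
            case = 3
        elif any(("imp", a, s) in earlier for a in earlier):
            case = 4                                      # case (4), modus ponens
        elif s[0] == "all" and s[1] not in premise_free and s[2] in earlier:
            case = 5                                      # case (5), rule G
        else:
            raise ValueError(f"step {i + 1} has no justification")
        cases.append(case)
    return cases


# Example: a demonstration of (x)(y) P ⊢_G (y)(x) P.
P = ("atom", "P", ("x", "y"))
PREMISE = ("all", "x", ("all", "y", P))
AX_A = ("imp", PREMISE, ("all", "y", P))   # assumed instance of Axiom scheme 6
AX_B = ("imp", ("all", "y", P), P)         # assumed instance of Axiom scheme 6
STEPS = [PREMISE, AX_A, ("all", "y", P), AX_B, P,
         ("all", "x", P), ("all", "y", ("all", "x", P))]
CASES = justify([PREMISE], STEPS, lambda s: s in (AX_A, AX_B))
```

Because the tests are tried in increasing order of case number, a step that
could be justified in two ways is automatically assigned the smaller number,
so counting the entries equal to 5 gives the number of uses of rule G.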

Theorem VI.7.1. If P_1, P_2, . . . , P_n ⊢_G Q, then P_1, P_2, . . . , P_n ⊢ Q.

Proof. Proof by induction on the number of uses of rule G in the
demonstration of P_1, P_2, . . . , P_n ⊢_G Q. If there are zero uses, then the
theorem is obvious. Let us now assume the theorem true whenever there are n or
fewer uses of rule G in the demonstration (n ≥ 0), and let us have a
demonstration with n + 1 uses of rule G. Let S_α be the first step in the
demonstration for which rule G was used. Then there is a β less than α and a
variable x, which has no free occurrences in any of P_1, P_2, . . . , P_n, such
that S_α is (x) S_β. Since S_α is the first use of rule G, the steps S_1, S_2, . . . , S_β
constitute a demonstration of P_1, P_2, . . . , P_n ⊢ S_β. Because there are no
free occurrences of x in any of P_1, P_2, . . . , P_n, it follows by Thm.VI.4.2 that
there is a demonstration of P_1, P_2, . . . , P_n ⊢ (x) S_β. Let T_1, T_2, . . . , T_t
be this demonstration. Then T_t is (x) S_β, which is S_α. Now in the
demonstration S_1, S_2, . . . , S_s replace the single step S_α by the sequence of steps
T_1, T_2, . . . , T_t. As T_t is S_α, we still have all the S's present, and in their
original order, so that we have another demonstration of P_1, P_2, . . . ,
P_n ⊢_G Q. However, whereas formerly S_α was derived from S_β by a use of
rule G, now S_α is derived without a use of rule G by means of the steps
T_1, T_2, . . . , T_t. So our new demonstration has only n uses of rule G, and
by our hypothesis of induction there is a demonstration of P_1, P_2, . . . ,
P_n ⊢ Q.

This theorem is a generalized form of Thm.VI.4.2. Thm.VI.4.2 covered
the case where there is a single use of rule G, and this occurs in the last step
of the demonstration.

Thm.VI.7.1 says that whatever can be proved by use of rule G can be
proved without it. Hence we do not really need rule G and are justified in
using merely modus ponens. On the other hand, Thm.VI.7.1 also justifies
the point of view that we might as well use rule G, since it will not lead to
any results that we could not get without it. This is our point of view.
In general, a demonstration of P_1, P_2, . . . , P_n ⊢_G Q will have many fewer
steps than the corresponding demonstration of P_1, P_2, . . . , P_n ⊢ Q. Thus
it is easier to find a demonstration of P_1, P_2, . . . , P_n ⊢_G Q, in spite of the
fact that we are assured by Thm.VI.7.1 that there must be a demonstration
of P_1, P_2, . . . , P_n ⊢ Q whenever there is a demonstration of P_1, P_2, . . . ,
P_n ⊢_G Q.

There is another rule which is also very useful. Like rule G, we do not
really need it, in the sense that we can always manage to do without it;
however, we manage to do without it only at the cost of making our
demonstrations much more lengthy and laborious. So, as with rule G, we shall
use this other rule but shall prove a theorem to the effect that whatever
results we get by using it can be got by the single rule of modus ponens.

In order to motivate this rule, let us look at an instance of its use in
everyday mathematics. On pages 14 to 15 of Bôcher, 1907, is proved the
theorem that, if two functions are continuous at a point, their sum is
continuous at this point. We now quote this proof (except for the minor
change of using functions of a single variable, instead of functions of several
variables).¹

"Let f_1 and f_2 be two functions continuous at the point c and let k_1 and k_2
be their respective values at this point. Then, no matter how small the
positive quantity ε may be chosen, we may take δ_1 and δ_2 so small that

| f_1 − k_1 | < ½ε when | x − c | < δ_1,

| f_2 − k_2 | < ½ε when | x − c | < δ_2.

Accordingly

| f_1 − k_1 | + | f_2 − k_2 | < ε when | x − c | < δ,

where δ is the smaller of the two quantities δ_1 and δ_2; and since

| A | + | B | ≥ | A + B |,

we have

| f_1 − k_1 + f_2 − k_2 | = | (f_1 + f_2) − (k_1 + k_2) | < ε when | x − c | < δ.

Hence f_1 + f_2 is continuous at the point c."

¹ From Bôcher, op. cit.

Let us focus our attention on the phrase "we may take δ_1 and δ_2 . . . ".
We had the hypothesis that f_1 and f_2 are continuous at c, namely, (for
i = 1, 2)

(ε)::ε > 0. ⊃ :.(Eδ):.δ > 0:(x):| x − c | < δ. ⊃ .| f_i(x) − k_i | < ε.

We have "chosen" an arbitrary positive ε and taken the ε of our
hypotheses to be ε/2, and so now have

(VI.7.1) (Eδ):.δ > 0:(x):| x − c | < δ. ⊃ .| f_i(x) − k_i | < ½ε.

Now what is the procedure by which we "take δ_1 and δ_2"? Clearly it
cannot be the same that we used when we "chose" a positive ε. There are
no restrictions on ε. We are free to choose any one that pleases us. Our
own convenience is the only criterion. This is not the case when we come
to "take" a δ. The statement (VI.7.1) assures us that there are δ's having
the property that we wish. However, it gives us no criteria for choosing
them.

The usual mathematical treatment offers no solution to this impasse.
The standard procedure, as illustrated by Bôcher's proof, is to assume that
somehow, by unexplained magic, we have got hold of the δ's we wish, and
then to proceed from there. From the point of view of symbolic logic, the
step amounts to proceeding (without explanation or justification) from
(VI.7.1) to

δ_i > 0:(x):| x − c | < δ_i. ⊃ .| f_i(x) − k_i | < ½ε.

That is, at this step, use is made of a rule "If (Ex) F(x), then F(y)."
In the case at hand, the rule is used twice, once to go from (Eδ) F_1(δ) to
F_1(δ_1), and once to go from (Eδ) F_2(δ) to F_2(δ_2).
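
The contrast between "choosing" ε and "taking" δ can be made concrete for
one particular pair of functions. In the sketch below the functions
f_1(x) = x², f_2(x) = 3x, the point c = 1, and the explicit formulas for δ_1
and δ_2 are all our own assumptions. A program cannot take a δ merely
because one exists, so the two delta functions must be supplied by hand;
they play the part of the unexplained act of choice.

```python
C = 1.0  # assumed point of continuity


def f1(x):
    return x * x


def f2(x):
    return 3.0 * x


K1, K2 = f1(C), f2(C)  # the values k_1, k_2 at the point c


def delta_1(eps):
    # |x^2 - 1| = |x - 1| |x + 1| < 3 |x - 1| whenever |x - 1| < 1,
    # so |x - 1| < min(1, eps/6) gives |f1(x) - k_1| < eps/2.
    return min(1.0, eps / 6.0)


def delta_2(eps):
    # |3x - 3| = 3 |x - 1|, so |x - 1| < eps/6 gives |f2(x) - k_2| < eps/2.
    return eps / 6.0


def sum_is_within(eps, samples=1000):
    """Sample |x - c| < δ with δ = min(δ_1, δ_2) and test the final inequality."""
    delta = min(delta_1(eps), delta_2(eps))
    for n in range(-samples + 1, samples):
        x = C + delta * n / samples
        if not abs((f1(x) + f2(x)) - (K1 + K2)) < eps:
            return False
    return True
```

Sampling finitely many points does not prove continuity; it only shows the
shape of the argument once explicit values of δ_1 and δ_2 are in hand.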

It  is  not  our  intent  to  furnish  any  philosophical  justification  for  such  a 
rule.  We  merely  observe  that  it  is  commonly  used  in  mathematics,  and  so 
we  seek  to  justify  its  use  in  symbolic  logic.  Actually,  if  we  succeed  in 
justifying  its  use  in  symbolic  logic,  where  nothing  more  extravagant  than 
modus  ponens  has  been  assumed,  this  will  furnish  justification  of  a  sort  for 
its  use  in  everyday  mathematics. 

What is involved psychologically in going from (Ex) F(x) to F(y) seems
to be the following. (Ex) F(x) says that there is at least one x which makes
F(x) true. From among such x's, let us "choose" one, and call it y. Then
y is an unknown, fixed quantity having the property F(y).

Explaining  the  step  from  (Ex)  F(x)  to  F(y)  as  depending  on  an  act  of 
choice  does  not  justify  it,  but  it  does  generate  certain  inhibitions  about  the 
use  of  the  step  which  prevent  us  from  using  it  improperly.  For  it  is  not 
always  permissible  to  go  from  (Ex)  F(x)  to  F(y),  and  without  some  criterion 
for  when  the  step  is  suitable,  we  should  get  into  trouble  by  using  it. 

Since the rule "If (Ex) F(x), then F(y)" corresponds to a hypothetical
act of choice, we shall call it the rule of choice, or more briefly, rule C.
In the symbolic logic, where reference to meaning is not permitted, we must
manage somehow to state the conditions for the use of rule C in a manner
which depends only on the forms of the statements involved. This we shall
do.

We must realize first of all that rule C, like rule G and modus ponens,
is something which is used in a demonstration. We shall make this more
explicit later by generalizing the notion ⊢ again. Accordingly, the
restrictions on rule C are on its use in a demonstration and depend on the
particular demonstration being considered. Thus a certain use of rule C may be
legitimate in one demonstration and not legitimate in a second
demonstration. We have encountered this situation already in connection with rule
G. If x has no free occurrences in P but does have free occurrences in R,
one could use rule G to go from S to (x) S in a demonstration of P ⊢_G Q,
but not in a demonstration of R ⊢_G Q.

In order to formulate the restrictions on rule C, let us analyze the
inhibitions inherent in thinking of it as an act of choice. In the first place, if we
have (Ex) F(x) and "choose" a y so that F(y), then our y is not only fixed
and unknown, but restricted. It cannot be just any quantity, as some
unknowns can. Hence we cannot expect to use the generalization principle
with this y. Or to put it in formal terms, if we use rule C to get from
(Ex) F(x) to F(y) with some y, then we cannot later in our demonstration
use rule G to go from G(y) to (y) G(y) with this same y. This restriction
on later uses of rule G does not apply merely to the variable y, but to every
variable which occurs free in F(y). This may seem unnecessarily severe,
but if we have a look at the meanings involved, we see that it is inescapable.
Let F(x,z) denote: "x is a prime greater than the prime z." Then
(Ex) F(x,z) is true for each prime z and is merely a statement of Euclid's
theorem that there are an infinity of primes. Now "choose" a y such that
F(y,z). We certainly have restricted y by so doing, but we have also
restricted z. For whereas (Ex) F(x,z) is true for every prime z, F(y,z) is
true only for those prime z's which are less than y, and it would now be
wholly inappropriate to use rule G with z.
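
The prime example can be run on a few small cases. The following sketch
takes F(x,z) exactly as defined above; the helper names, the short list of
primes, and the decision to fix z = 5 before choosing y are assumptions made
for illustration.

```python
def is_prime(n):
    """Trial-division primality test, adequate for small n."""
    if n < 2:
        return False
    d = 2
    while d * d <= n:
        if n % d == 0:
            return False
        d += 1
    return True


def F(x, z):
    """F(x,z): x is a prime greater than the prime z."""
    return is_prime(x) and is_prime(z) and x > z


def choose(z):
    """Return the least x with F(x, z); Euclid's theorem says one exists."""
    x = z + 1
    while not is_prime(x):
        x += 1
    return x


SMALL_PRIMES = [2, 3, 5, 7, 11, 13]

# (Ex) F(x,z) holds for each listed prime z.
exists_for_every_z = all(F(choose(z), z) for z in SMALL_PRIMES)

# Fix z = 5 and "choose" a y with F(y, 5).
y = choose(5)

# With this y held fixed, F(y,z) survives only for the primes z below y.
still_true_for = [z for z in SMALL_PRIMES if F(y, z)]
```

Once y has been fixed, F(y,z) fails for 7, 11, and 13, which is the
restriction on z described above and the reason rule G may no longer be
used with z.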

In applying rule C to get F(y) from (Ex) F(x), certain precautions on the
choice of the letter y must be observed. Note that the choice of a letter y
to appear in F(y) is quite a different matter from actually choosing one of
the quantities which make F(x) true. The letter y does not make F(y) true.
It merely stands in F(y) and denotes some quantity which makes F(x)
true. The letter y is a name, and if we are ignoring meanings (as we are)
we can choose a name without choosing the quantity of which it is a name.
Our restrictions on y are essentially that it should not be a name that has
already been assigned to some other fixed or restricted quantity, for if it
were, this would amount to assuming that the quantity which we are
choosing as one of those which make F(x) true happens to be a previously
assigned quantity; certainly a dubious assumption.

Formally, the restriction on the letter y when we go from (Ex) F(x) to
F(y) by rule C is merely the following. In the first place, y should not
occur free in any of the P's which precede the ⊢ sign. These P's are all
assumptions, and any y which occurs free in a P is therefore subject to a
restriction. Furthermore, if at some previous point in our demonstration
we have used rule C to go from (Ev) G(v) to G(w), then y must not have
occurred free in G(w). We call attention to the fact that Bôcher observed
this restriction in his proof. His first use of rule C was to go from (Eδ) F_1(δ)
to F_1(δ_1), and his second use was to go from (Eδ) F_2(δ) to F_2(δ_2), and he
used different letters, δ_1 and δ_2, in the two cases.

One final point, and we have all the necessary restrictions. If (Ex) F(x)
is true, and we use rule C to infer F(y), no one claims that F(y) is true.
Although we pretend to choose y so that F(y) is true, we know that usually
we have not sufficient information to allow us actually to make such a
choice. So F(y) has a sort of quasi truth, in that, if an omniscient being
were performing the proof, he could actually choose a y. All subsequent
statements which contain y also have only the same sort of quasi truth.
However, if we later derive statements which do not contain y, they do not
depend on the actual choice of y, but only on the theoretical possibility of
choosing y, which possibility is supposedly guaranteed by the statement
(Ex) F(x). So statements which do not contain y will be genuinely true.
To put this in terms of a demonstration, if we get Q from P_1, P_2, . . . , P_n
and the axioms by modus ponens, rule G, and rule C, then we only have
P_1, P_2, . . . , P_n ⊢ Q in case Q does not contain any y used with rule C.

We have now compiled a list of restrictions based on the idea that rule C
is the formal equivalent of an act of choice. We next have to show that this
list of restrictions is adequate. To show this, we define P_1, P_2, . . . , P_n ⊢_C Q
to mean that one can get Q from the P's and the axioms by modus ponens,
rule G, and rule C, subject to all the stated restrictions, and then prove that,
whenever we have P_1, P_2, . . . , P_n ⊢_C Q, then we have P_1, P_2, . . . , P_n ⊢ Q.
This will then tell us that we do not need to use rule C (or rule G), and
equally it will tell us that we may perfectly well use it if we wish.

Actually, it turns out to be inconvenient to embody all our restrictions
in the definition of ⊢_C. So we embody most of our restrictions in the
definition of ⊢_C, and insert the remaining restrictions as an additional
hypothesis in the theorem which says that, if P_1, P_2, . . . , P_n ⊢_C Q, then
P_1, P_2, . . . , P_n ⊢ Q.

The precise definition of ⊢_C is as follows.

P_1, P_2, . . . , P_n ⊢_C Q indicates that there is a sequence of statements
S_1, S_2, . . . , S_s such that S_s is Q, and for each S_i either:

(1) S_i is an axiom.

(2) S_i is a P.

(3) There is a j less than i such that S_i and S_j are the same.

(4) There are j and k, each less than i, such that S_k is S_j ⊃ S_i.

(5) There is a variable x, which does not occur free in any of P_1, P_2,
. . . , P_n, or in any earlier step which is a result of rule C, and there is a j less
than i such that S_i is (x) S_j.

(6) There are variables x and y, not necessarily distinct, such that y
does not occur free in any of P_1, P_2, . . . , P_n, or in any earlier step which is a
result of rule C, and there is a j less than i such that S_j is (Ex) W where W
is the result of replacing all free occurrences of y in S_i by occurrences of
x and S_i is the result of replacing all free occurrences of x in W by
occurrences of y.

We still have to define which steps are results of rule C. These will be
the steps covered by case (6) above, except that as before we arrange that
each step shall be covered by a unique case by agreeing that, whenever a
step can be justified by two different cases, we shall take as the justification
that case having the smaller number. Among other things, this ensures
that we use rule G and rule C as few times as possible.

As usual, the sequence of S's is called a demonstration, and the S's are
called the steps of the demonstration.

By our conventions, we could shorten the latter portion of case (6) to:

". . . and there is a j less than i such that S_j is (Ex) F(x) and S_i is F(y)."

In this case, we say that F(y) is derived from (Ex) F(x) by a use of rule C
with y.
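
Condition (6) is purely a matter of form, so it can be tested mechanically.
The sketch below checks a single proposed use of rule C. The tuple encoding
of statements, the tag "ex" for (Ex), and every function name are our own
assumptions; the example data mirror the two uses of rule C in Bôcher's
proof, with d, d1, d2 standing for δ, δ_1, δ_2.

```python
def free_vars(f):
    """Free variables of a statement encoded as a nested tuple (our encoding)."""
    tag = f[0]
    if tag == "atom":            # ("atom", name, (variables...))
        return set(f[2])
    if tag == "not":             # ("not", A)
        return free_vars(f[1])
    if tag == "imp":             # ("imp", A, B)
        return free_vars(f[1]) | free_vars(f[2])
    if tag in ("all", "ex"):     # ("all", x, A) or ("ex", x, A)
        return free_vars(f[2]) - {f[1]}
    raise ValueError(f"unknown statement tag: {tag!r}")


def replace_free(f, old, new):
    """Replace all free occurrences of the variable old in f by new."""
    tag = f[0]
    if tag == "atom":
        return ("atom", f[1], tuple(new if v == old else v for v in f[2]))
    if tag == "not":
        return ("not", replace_free(f[1], old, new))
    if tag == "imp":
        return ("imp", replace_free(f[1], old, new), replace_free(f[2], old, new))
    if tag in ("all", "ex"):
        if f[1] == old:          # old is bound here: nothing free to replace
            return f
        return (tag, f[1], replace_free(f[2], old, new))
    raise ValueError(f"unknown statement tag: {tag!r}")


def is_rule_c_step(s_i, s_j, y, premises, earlier_c_results):
    """Does s_i follow from s_j by rule C with y, under condition (6)?"""
    restricted = set().union(
        *map(free_vars, list(premises) + list(earlier_c_results)))
    if y in restricted or s_j[0] != "ex":
        return False
    x, w = s_j[1], s_j[2]        # s_j is (Ex) W
    # W comes from S_i by y -> x, and S_i comes back from W by x -> y.
    return replace_free(s_i, y, x) == w and replace_free(w, x, y) == s_i


EXISTS_F1 = ("ex", "d", ("atom", "F1", ("d",)))   # (Eδ) F_1(δ)
EXISTS_F2 = ("ex", "d", ("atom", "F2", ("d",)))   # (Eδ) F_2(δ)
F1_D1 = ("atom", "F1", ("d1",))                   # F_1(δ_1)
F2_D1 = ("atom", "F2", ("d1",))                   # F_2(δ_1), reuses the letter
F2_D2 = ("atom", "F2", ("d2",))                   # F_2(δ_2)
```

With these data the first use, from (Eδ) F_1(δ) to F_1(δ_1), is accepted; a
second use that reuses δ_1 is rejected because δ_1 occurs free in an earlier
result of rule C; and the second use with the fresh letter δ_2 is accepted.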

**Theorem VI.7.2. Suppose that P_1, P_2, . . . , P_n ⊢_C Q. Furthermore,
let y_1, y_2, . . . , y_m be the y's with which rule C is used in the given
demonstration of P_1, P_2, . . . , P_n ⊢_C Q. If none of these y's occur free in Q,
then P_1, P_2, . . . , P_n ⊢ Q.

Proof. Let S_1, S_2, . . . , S_s be the steps of the given demonstration of
P_1, P_2, . . . , P_n ⊢_C Q. Let F_1(y_1), F_2(y_2), . . . , F_m(y_m) be the steps which
are results of rule C, in the order in which they occur in the demonstration
S_1, S_2, . . . , S_s, and let y_1, y_2, . . . , y_m be the y's with which these uses of
rule C are made. Define α_i to be the greatest α such that the step F_α(y_α)
does not occur later than the step S_i (for those i's such that S_i occurs
before F_1(y_1), we take α_i = 0).

Lemma A. For 1 ≤ i ≤ s,

P_1, P_2, . . . , P_n, F_1(y_1), F_2(y_2), . . . , F_{α_i}(y_{α_i}) ⊢ S_i.

(If S_i precedes F_1(y_1), then α_i = 0 and this takes the form P_1, P_2, . . . ,
P_n ⊢ S_i.)
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Proof.  We  prove  this  by  induction  on  i.  Hi  =  1,  then  Si  must  be  an 
axiom  or  a  P,  and  the  lemma  is  obvious.  Now  assume  the  lemma  true  for 
i  <  k,  and  let  *  =  /b  +  1.  If  St  is  an  axiom  or  a  P,  our  lemma  is  obvious 
Let  Si  be  a  repetition  of  an  earlier  Sj.    Then  by  hypothesis, 

Pi,  P„  ...,  P„,  F,(y,),  F,{y,),  ...,  F^^y^,)  \-  S^. 

If  a,  =  ai,  then  we  have  our  result,  since  Si  is  the  same  as  S,.  If  a,-  9^  a.-, 
then  clearly  a,-  <  ai,  since 7  <  i.  So  we  can  get  the  desired  result  by  insert- 
ing the  additional  hypotheses 

/^a,+l(?/a,+l),    ^a,+2(?/a,+2),     •    •    •    ,    F  ^  .(V  a ,) 

before  the  \-  sign. 

If $S_i$ is the result of modus ponens, then we have $S_j$ and $S_k$, each previous
to $S_i$, and such that $S_k$ is $S_j \supset S_i$. So by hypothesis we have

$P_1, P_2, \ldots, P_n, F_1(y_1), F_2(y_2), \ldots, F_{\alpha_j}(y_{\alpha_j}) \vdash S_j,$

$P_1, P_2, \ldots, P_n, F_1(y_1), F_2(y_2), \ldots, F_{\alpha_k}(y_{\alpha_k}) \vdash S_j \supset S_i.$

If necessary, we adjoin additional assumptions before the $\vdash$ sign, and so
infer

$P_1, P_2, \ldots, P_n, F_1(y_1), F_2(y_2), \ldots, F_{\alpha_i}(y_{\alpha_i}) \vdash S_j,$

$P_1, P_2, \ldots, P_n, F_1(y_1), F_2(y_2), \ldots, F_{\alpha_i}(y_{\alpha_i}) \vdash S_j \supset S_i.$

If now we write out in succession the steps of these two demonstrations
and add a final step $S_i$, we shall have a demonstration of

$P_1, P_2, \ldots, P_n, F_1(y_1), F_2(y_2), \ldots, F_{\alpha_i}(y_{\alpha_i}) \vdash S_i.$

If $S_i$ is the result of rule G, then $S_i$ has the form $(x) S_j$ where $j$ is less
than $i$ and $x$ does not occur free in any of

$P_1, P_2, \ldots, P_n, F_1(y_1), F_2(y_2), \ldots, F_{\alpha_i}(y_{\alpha_i});$

this restriction on the free occurrences of $x$ being exactly the condition that
was imposed in the definition of $\vdash_C$. By hypothesis

$P_1, P_2, \ldots, P_n, F_1(y_1), F_2(y_2), \ldots, F_{\alpha_j}(y_{\alpha_j}) \vdash S_j.$

So

$P_1, P_2, \ldots, P_n, F_1(y_1), F_2(y_2), \ldots, F_{\alpha_i}(y_{\alpha_i}) \vdash S_j.$

Hence by the generalization theorem (Thm.VI.4.2), recalling that $(x) S_j$
is $S_i$,

$P_1, P_2, \ldots, P_n, F_1(y_1), F_2(y_2), \ldots, F_{\alpha_i}(y_{\alpha_i}) \vdash S_i.$

If $S_i$ is the result of rule C, then $S_i$ is $F_{\alpha_i}(y_{\alpha_i})$ by the definition of $\alpha_i$, and
the demonstration of

$P_1, P_2, \ldots, P_n, F_1(y_1), F_2(y_2), \ldots, F_{\alpha_i}(y_{\alpha_i}) \vdash S_i$

consists of the single step $S_i$, which we justify by noting that it is just
one of the hypotheses, namely $F_{\alpha_i}(y_{\alpha_i})$.

Lemma B.

$P_1, P_2, \ldots, P_n, F_1(y_1), F_2(y_2), \ldots, F_m(y_m) \vdash Q.$

This follows from Lemma A by putting $i = s$.

Lemma C. For each $\alpha$ with $1 \le \alpha \le m$, there is an $x_\alpha$ such that

$P_1, P_2, \ldots, P_n, F_1(y_1), F_2(y_2), \ldots, F_{\alpha-1}(y_{\alpha-1}) \vdash (Ex_\alpha) F_\alpha(x_\alpha).$

Proof. To prove this, consider the step $F_\alpha(y_\alpha)$. This is derived by
rule C from some earlier $S_j$ of the form $(Ex_\alpha) F_\alpha(x_\alpha)$. By Lemma A, we
have

$P_1, P_2, \ldots, P_n, F_1(y_1), F_2(y_2), \ldots, F_{\alpha_j}(y_{\alpha_j}) \vdash (Ex_\alpha) F_\alpha(x_\alpha).$

Since $S_j$ occurs before $F_\alpha(y_\alpha)$, we have $\alpha_j < \alpha$ and so $\alpha_j \le \alpha - 1$, and our
lemma follows.

Lemma D. For $0 \le \beta \le m$,

$P_1, P_2, \ldots, P_n, F_1(y_1), F_2(y_2), \ldots, F_{m-\beta}(y_{m-\beta}) \vdash Q.$

Proof. We prove this by induction on $\beta$. If $\beta = 0$, our lemma follows
by Lemma B. Assume the lemma for $\beta$ ($0 \le \beta < m$), and prove it for
$\beta + 1$. Since the lemma is true for $\beta$, we have

$P_1, P_2, \ldots, P_n, F_1(y_1), F_2(y_2), \ldots, F_{m-\beta}(y_{m-\beta}) \vdash Q.$

So by the deduction theorem

$P_1, P_2, \ldots, P_n, F_1(y_1), F_2(y_2), \ldots, F_{m-(\beta+1)}(y_{m-(\beta+1)}) \vdash F_{m-\beta}(y_{m-\beta}) \supset Q.$

Now $y_{m-\beta}$ does not occur free in any of

$P_1, P_2, \ldots, P_n, F_1(y_1), F_2(y_2), \ldots, F_{m-(\beta+1)}(y_{m-(\beta+1)}),$

this restriction on the free occurrences of $y_{m-\beta}$ being exactly the condition
that was imposed in the definition of $\vdash_C$. So by the generalization theorem

$P_1, P_2, \ldots, P_n, F_1(y_1), F_2(y_2), \ldots, F_{m-(\beta+1)}(y_{m-(\beta+1)}) \vdash (y_{m-\beta}).F_{m-\beta}(y_{m-\beta}) \supset Q.$

By the hypothesis of our theorem, $y_{m-\beta}$ does not occur free in $Q$. So by
Thm.VI.6.6, Part VII,

$\vdash (y_{m-\beta}).F_{m-\beta}(y_{m-\beta}) \supset Q: \equiv :(Ey_{m-\beta}) F_{m-\beta}(y_{m-\beta}). \supset .Q.$

Also, by Thm.VI.6.8, Part II,

$\vdash (Ey_{m-\beta}) F_{m-\beta}(y_{m-\beta}) \equiv (Ex_{m-\beta}) F_{m-\beta}(x_{m-\beta}).$

Hence we have

$P_1, P_2, \ldots, P_n, F_1(y_1), F_2(y_2), \ldots, F_{m-(\beta+1)}(y_{m-(\beta+1)}) \vdash (Ex_{m-\beta}) F_{m-\beta}(x_{m-\beta}) \supset Q.$

However, by Lemma C

$P_1, P_2, \ldots, P_n, F_1(y_1), F_2(y_2), \ldots, F_{m-(\beta+1)}(y_{m-(\beta+1)}) \vdash (Ex_{m-\beta}) F_{m-\beta}(x_{m-\beta}).$

So we conclude

$P_1, P_2, \ldots, P_n, F_1(y_1), F_2(y_2), \ldots, F_{m-(\beta+1)}(y_{m-(\beta+1)}) \vdash Q,$

and our induction step is complete. Accordingly, our lemma is proved by
induction.

Our theorem now follows by putting $\beta = m$ in Lemma D.

Because of the above theorem, we are relatively free to use rule C whenever
convenient. It is true that, in our definition of $\vdash_C$ and in our statement
of the above theorem, many restrictions on the use of rule G and rule C
are listed. Actually these correspond precisely to the inhibitions that would
quite properly arise if one were thinking of rule C as being the act of
"choosing" a $y$ to be one of the $x$'s which make $F(x)$ true. Consequently,
in any situation in everyday mathematics in which it would be considered
legitimate to make such an act of choice, we shall find our restrictions
satisfied and shall be able to use rule C.

Let  it  be  recalled  that  rule  C,  together  with  its  attendant  restrictions,  is 
purely  formal  and  depends  entirely  on  the  forms  of  the  statements  involved. 
Thus our Thm.VI.7.2, which we just proved, enables us to replace the
psychological process of "choosing" a $y$ to be one of the $x$'s which make
$F(x)$ true by a purely formal process. This replacement constitutes a
considerable logical advance. There are numerous connotations of the act
of choosing which are disturbing to many mathematicians who are careful
in their reasoning. This is particularly the case when $(Ex) F(x)$ has been
proved by reductio ad absurdum or some indirect process which gives no
clues whatever as to how one might proceed to find one of the $x$'s which
make $F(x)$ true. It is quite worth while to dispense with these disturbing
connotations,  even  if  it  means  replacing  them  by  a  collection  of  somewhat 
arbitrary  rules. 

As  permitted  by  Thm.VI.7.2,  we  shall  make  constant  use  of  rule  G  and 
rule  C  hereafter.  This  will  result  in  a  great  simplification  in  our  proofs,  in 
spite of the complexities in the restrictions given in the definition of $\vdash_C$ and
in the hypothesis of Thm.VI.7.2. One reason for the simplification is that
in general a demonstration of

$P_1, P_2, \ldots, P_n \vdash_C Q$

is tremendously shorter than the corresponding demonstration of

$P_1, P_2, \ldots, P_n \vdash Q.$

Although  we  no  longer  write  out  any  demonstrations  in  full,  but  merely 
give  instructions  for  writing  them  out,  there  is  still  an  advantage  in  dealing 
with  shorter  demonstrations,  since  fewer  instructions  will  be  required. 

Besides  the  fact  that  we  get  shorter  demonstrations  by  use  of  rule  G  and 
rule  C,  there  is  the  important  consideration  that  demonstrations  are  easier 
to  discover  if  we  permit  the  use  of  rule  G  and  rule  C.  This  is  because  rule  G 
and  rule  C  correspond  rather  closely  to  operations  commonly  used  in 
everyday  mathematics.  This  consideration  is  less  important  once  the 
demonstration  has  been  discovered,  but  even  the  person  who  is  merely 
following  demonstrations  written  out  by  someone  else  will  find  them  easier 
to  follow  if  they  embody  familiar  operations,  such  as  rule  G  and  rule  C, 
rather  than  the  very  involved  and  unfamiliar  operations  needed  if  we  do  not 
allow  the  use  of  rule  G  and  rule  C  (see  the  proof  of  Thm.VI.7.2,  for  instance) . 

To a considerable extent, rule G and rule C enable us to deal with
unquantified statements rather than with quantified ones. Thus suppose we
have some quantified statements. By Thm.VI.5.1, we can remove the
universal quantifiers, and by rule C, we can remove the existential
quantifiers. We then proceed, unhampered by quantifiers. At the end, we may
have to put back the quantifiers. Rule G is available for the purpose of
attaching universal quantifiers. For the purpose of attaching existential
quantifiers, we have the following theorem:

*Theorem VI.7.3. $\vdash F(y,y) \supset (Ex) F(x,y)$.

Proof. By Axiom scheme 6, $\vdash (x) {\sim}F(x,y) \supset {\sim}F(y,y)$. So by
Thm.VI.6.1, Part XLV, $\vdash F(y,y) \supset {\sim}(x) {\sim}F(x,y)$. This is our theorem.

**Corollary. $\vdash P \supset (Ex) P$.

Proof. Take $y$ to be the same as $x$.
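
The content of Thm.VI.7.3 and its corollary can also be checked mechanically. What follows is a minimal Lean 4 sketch using only the core library; the theorem names and the carrier type `U` are illustrative assumptions and do not come from the text. It reads $F(x,y)$ as a two-place predicate and supplies $y$ itself as the witness for the existential quantifier.

```lean
-- Thm.VI.7.3 (sketch): from F(y,y) infer (Ex) F(x,y); the witness is y.
-- `U` and the theorem names are hypothetical, chosen for illustration.
theorem ex_of_diagonal {U : Type} (F : U → U → Prop) (y : U) :
    F y y → ∃ x, F x y :=
  fun h => ⟨y, h⟩

-- Corollary (sketch): P ⊃ (Ex) P, reading P as a statement about x.
theorem ex_of_instance {U : Type} (P : U → Prop) (x : U) :
    P x → ∃ x', P x' :=
  fun h => ⟨x, h⟩
```

Lean's existential introduction is primitive, so this sketch does not go through Axiom scheme 6 and contraposition as the proof above does.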

We now give four theorems whose proofs follow the routine just
indicated, namely, one first removes quantifiers, then performs some simple
operations, and then replaces the quantifiers.

Theorem VI.7.4.
I. $\vdash (Ex).PQ: \supset :(Ex) P.(Ex) Q$.
II. $\vdash (x) P.\lor.(x) Q: \supset :(x).P \lor Q$.

Proof of Part I. We first undertake to prove

$(Ex) PQ \vdash_C (Ex) P.(Ex) Q.$

We start with $(Ex).PQ$. Then by rule C with $x$, $PQ$. So by Thm.IV.4.18,
$P$ and $Q$. So by Thm.VI.7.3, corollary, $(Ex) P$ and $(Ex) Q$.
Finally by Thm.IV.4.22, $(Ex) P.(Ex) Q$. Accordingly, we can infer that
there is a demonstration of

$(Ex) PQ \vdash_C (Ex) P.(Ex) Q$

as soon as we have shown that the given restrictions are satisfied. When
we referred to Thm.IV.4.18, Thm.VI.7.3, corollary, and Thm.IV.4.22, we
skipped over large numbers of steps. Could any of these steps involve
rule G or rule C? No, because the theorems quoted all involved a $\vdash$ and
not a $\vdash_G$ or $\vdash_C$. So the only use of rule C is that indicated, by which we went
from $(Ex).PQ$ to $PQ$. This use of rule C is permissible since it involves $x$,
which does not occur free in $(Ex) PQ$. Also there are no uses of rule G.
Accordingly, our restrictions are satisfied, and we know that there is a
demonstration of

$(Ex) PQ \vdash_C (Ex) P.(Ex) Q.$

Moreover, the only use of rule C in this demonstration is with $x$, which
does not occur free in $(Ex) P.(Ex) Q$. So by Thm.VI.7.2,

$(Ex) PQ \vdash (Ex) P.(Ex) Q.$

Then Part I follows by the deduction theorem.

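Proof of Part II. In Part I, replace $P$ and $Q$ by ${\sim}P$ and ${\sim}Q$. This gives

$\vdash (Ex).{\sim}P{\sim}Q: \supset :(Ex) {\sim}P.(Ex) {\sim}Q.$

So

$\vdash {\sim}((Ex) {\sim}P.(Ex) {\sim}Q) \supset .{\sim}(Ex).{\sim}P{\sim}Q.$

However, by the duality theorem,

$\vdash {\sim}((Ex) {\sim}P.(Ex) {\sim}Q): \equiv :(x) P.\lor.(x) Q$

$\vdash {\sim}(Ex).{\sim}P{\sim}Q: \equiv :(x).P \lor Q.$

So Part II follows.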
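
Both parts of Thm.VI.7.4 can be restated in a proof assistant. The following Lean 4 sketch (core library only; the names and the type `U` are illustrative assumptions) treats $P$ and $Q$ as predicates of $x$. Destructuring the existential hypothesis plays the part of rule C, and rebuilding a pair `⟨x, _⟩` plays the part of Thm.VI.7.3, corollary.

```lean
-- Thm.VI.7.4, Part I (sketch): (Ex).PQ implies (Ex) P and (Ex) Q.
theorem ex_and_split {U : Type} (P Q : U → Prop) :
    (∃ x, P x ∧ Q x) → (∃ x, P x) ∧ (∃ x, Q x) :=
  fun ⟨x, hp, hq⟩ => ⟨⟨x, hp⟩, ⟨x, hq⟩⟩

-- Thm.VI.7.4, Part II (sketch): (x) P or (x) Q implies (x).P v Q.
theorem all_or_merge {U : Type} (P Q : U → Prop) :
    (∀ x, P x) ∨ (∀ x, Q x) → ∀ x, P x ∨ Q x :=
  fun h x => h.elim (fun hp => Or.inl (hp x)) (fun hq => Or.inr (hq x))
```

Part II is proved directly here by cases on the disjunction, rather than by the duality argument used above; the statement proved is the same.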

In the preceding proof we made quite a bother over showing that we had
satisfied the various restrictions involved in the definition of $\vdash_C$ and in the
hypothesis of Thm.VI.7.2. By proceeding in a systematic manner, we can
check on these restrictions rather easily. First of all, each time we use
rule C, we should note the $F_\alpha(y_\alpha)$ which results, and the $y_\alpha$ which we use.
Then at the end of the demonstration, it is very easy to check if any of
these $y_\alpha$'s occur free in $Q$. This takes care of the hypothesis of Thm.VI.7.2.
There still remain the restrictions embodied in the definition of $\vdash_C$. Note
that there are restrictions only on the use of rule G and rule C, and that
for any given use of rule G or rule C the restriction depends only on the $P$'s
and the previously occurring $F_\alpha(y_\alpha)$'s. So one can check the restrictions
on a given use of rule G or rule C at the time the rule is used. Also, the
restrictions merely involve the question whether a given variable occurs
free in the $P$'s or the previous $F_\alpha(y_\alpha)$'s, which can be easily checked if one
is keeping a list of the $F_\alpha(y_\alpha)$'s.

Furthermore, whenever we skip over a sequence of steps by referring to a
previously proved theorem $\vdash A$, there is nothing to check, since the steps of
the demonstration of $\vdash A$ involve no uses of rule G or rule C. Also, since
there are no uses of rule C, the demonstration of $\vdash A$ will not produce any
$F_\alpha(y_\alpha)$ that we should add to our list.

In summary, if we compile a list of the $F_\alpha(y_\alpha)$ and $y_\alpha$ as we go along, it
is very simple to check the restrictions involved in the definition of $\vdash_C$,
since one merely checks each use of rule G or rule C at the time of its use
by reference to the $P$'s and the previously occurring $F_\alpha(y_\alpha)$'s. Also at the
end, we have a complete list of the $y_\alpha$'s, and can readily check if any of
them occur free in $Q$.

Bearing  this  in  mind,  let  us  now  look  at  the  proofs  of  the  remaining  three 
illustrative  theorems. 

Theorem VI.7.5. $\vdash (Ex)(y) P \supset (y)(Ex) P$.

Proof. Start with $(Ex)(y) P$. As $x$ does not occur free in this, we can
use rule C with $x$ and get $(y) P$. Then by Thm.VI.5.1, $P$. Then by
Thm.VI.7.3, corollary, $(Ex) P$. As $y$ does not occur free in $(Ex)(y) P$ or
$(y) P$, we can use rule G with $y$ and get $(y)(Ex) P$. So we have shown

$(Ex)(y) P \vdash_C (y)(Ex) P.$

As $x$ does not occur free in $(y)(Ex) P$, we infer

$(Ex)(y) P \vdash (y)(Ex) P$

by Thm.VI.7.2.
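
As a cross-check on Thm.VI.7.5, here is a minimal Lean 4 sketch (core library only). The theorem name and the types `U`, `V` are assumptions made for illustration; $P$ is read as a predicate of both $x$ and $y$.

```lean
-- Thm.VI.7.5 (sketch): (Ex)(y) P implies (y)(Ex) P.
-- Destructuring the hypothesis corresponds to rule C with x;
-- `fun y => ...` corresponds to rule G with y.
theorem ex_all_swap {U V : Type} (P : U → V → Prop) :
    (∃ x, ∀ y, P x y) → ∀ y, ∃ x, P x y :=
  fun ⟨x, h⟩ => fun y => ⟨x, h y⟩
```

The same single $x$ serves every $y$, which is why the implication holds in this direction only.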

Theorem VI.7.6. $\vdash (x).P \supset Q: \supset :(Ex) P. \supset .(Ex) Q$.

Proof. Start with $(x).P \supset Q$ and $(Ex) P$. By Thm.VI.5.1, $P \supset Q$.
As $x$ does not occur free in $(x).P \supset Q$ or $(Ex) P$, we can use rule C with $x$
to get $P$ from $(Ex) P$. Then from $P$ and $P \supset Q$ by modus ponens, we get
$Q$. Finally, by Thm.VI.7.3, corollary, $(Ex) Q$. So

$(x).P \supset Q, (Ex) P \vdash_C (Ex) Q.$

As $x$ does not occur free in $(Ex) Q$, we get

$(x).P \supset Q, (Ex) P \vdash (Ex) Q.$

Then our theorem follows by two uses of the deduction theorem.
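
The four moves of this proof (instantiate, choose, modus ponens, re-quantify) are visible in a short Lean 4 sketch. Only the core library is used; the theorem name and the type `U` are illustrative assumptions.

```lean
-- Thm.VI.7.6 (sketch): (x).P ⊃ Q implies (Ex) P ⊃ (Ex) Q.
theorem ex_mono {U : Type} (P Q : U → Prop) :
    (∀ x, P x → Q x) → (∃ x, P x) → ∃ x, Q x :=
  fun hall => fun ⟨x, hp⟩ =>
    -- hall x     : instantiate the universal statement (Thm.VI.5.1)
    -- hall x hp  : modus ponens
    -- ⟨x, _⟩     : reattach the existential quantifier (Thm.VI.7.3, corollary)
    ⟨x, hall x hp⟩
```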

Theorem VI.7.7. $\vdash (Ex).P \supset Q: \supset :(x) P. \supset .(Ex) Q$.

Proof. Start with $(Ex).P \supset Q$ and $(x) P$. By rule C with $x$, $P \supset Q$, and
by Thm.VI.5.1, $P$. So by modus ponens, $Q$. Finally by Thm.VI.7.3,
corollary, $(Ex) Q$. So

$(Ex).P \supset Q, (x) P \vdash_C (Ex) Q.$

So

$(Ex).P \supset Q, (x) P \vdash (Ex) Q.$
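
A corresponding Lean 4 sketch for Thm.VI.7.7 follows (core library only; the name and the type `U` are assumptions for illustration). The witness comes from the existential hypothesis, and the universal hypothesis is instantiated at that same witness.

```lean
-- Thm.VI.7.7 (sketch): (Ex).P ⊃ Q implies (x) P ⊃ (Ex) Q.
theorem ex_of_ex_imp {U : Type} (P Q : U → Prop) :
    (∃ x, P x → Q x) → (∀ x, P x) → ∃ x, Q x :=
  fun ⟨x, himp⟩ => fun hall => ⟨x, himp (hall x)⟩
```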

It  will  be  instructive  to  go  back  and  review  Bocher's  proof  that  the  sum 
of two continuous functions is continuous; in particular let us rewrite it as
a formal proof involving rule G and rule C, with appropriate use of
Thm.VI.7.2.  As  we  shall  now  add,  in  full  detail,  all  the  logical  steps  which 
Bocher  takes  for  granted,  our  formalized  version  will  be  much  longer  than 
Bocher's  original.  We  mention  this  in  order  to  point  out  that  the  extra 
length  is  due  to  the  insertion  of  omitted  steps,  rather  than  to  the  use  of 
formal  logic  instead  of  intuitive  logic. 
Let $A_i$ ($i = 1, 2$) denote

$(\varepsilon):.\varepsilon > 0. \supset :(E\delta):\delta > 0:(x):|x - c| < \delta. \supset .|f_i(x) - f_i(c)| < \varepsilon;$

that is, $A_i$ denotes the statement "$f_i$ is continuous at the point $c$." Then
the theorem we are to prove is:

$\vdash A_1 \& A_2. \supset ::(\varepsilon):.\varepsilon > 0. \supset :(E\delta):\delta > 0:(x):|x - c| < \delta. \supset .|f_1(x) + f_2(x) - f_1(c) - f_2(c)| < \varepsilon.$

Because of the generalization theorem (Thm.VI.4.2) and the deduction
theorem, it will suffice to prove

$A_1 \& A_2 \vdash \varepsilon > 0. \supset :(E\delta):\delta > 0:(x):|x - c| < \delta. \supset .|f_1(x) + f_2(x) - f_1(c) - f_2(c)| < \varepsilon.$

Because of the deduction theorem, it will suffice to prove

$A_1 \& A_2, \varepsilon > 0 \vdash (E\delta):\delta > 0:(x):|x - c| < \delta. \supset .|f_1(x) + f_2(x) - f_1(c) - f_2(c)| < \varepsilon.$

So let us start with

(1) $A_1 \& A_2$

(2) $\varepsilon > 0$.

We are now at the point corresponding to Bocher's clause¹ "Then, no
matter how small the positive quantity $\varepsilon$ may be chosen," but must insert
several steps before we can proceed with the rest of his sentence¹ "we may
take $\delta_1$ and $\delta_2$ ...". By Axiom scheme 6, we have for $i = 1, 2$,

$A_i. \supset :.\varepsilon/2 > 0. \supset :(E\delta):\delta > 0:(x):|x - c| < \delta. \supset .|f_i(x) - f_i(c)| < \varepsilon/2.$

However, from our assumption (1), we get $A_1$ and $A_2$, and so by modus
ponens, we get for $i = 1, 2$,

$\varepsilon/2 > 0. \supset :(E\delta):\delta > 0:(x):|x - c| < \delta. \supset .|f_i(x) - f_i(c)| < \varepsilon/2.$
¹ From Bocher, op. cit.

From our assumption (2), we get $\varepsilon/2 > 0$ (this is one of the few steps in
this proof which is not purely logical), and so by modus ponens, we get
for $i = 1, 2$,

$(E\delta):\delta > 0:(x):|x - c| < \delta. \supset .|f_i(x) - f_i(c)| < \varepsilon/2.$

Now by rule C, we get

(3) $\delta_1 > 0:(x):|x - c| < \delta_1. \supset .|f_1(x) - f_1(c)| < \varepsilon/2.$

By rule C again, we get

(4) $\delta_2 > 0:(x):|x - c| < \delta_2. \supset .|f_2(x) - f_2(c)| < \varepsilon/2.$

This brings us to the end of that sentence in Bocher's proof.
Now take $\delta$ the smaller of $\delta_1$ and $\delta_2$ (the logic involved in this step will be
discussed in Chapter VIII). Then we have

(5) $\delta > 0$,

(6) $\delta \le \delta_1$,

(7) $\delta \le \delta_2$.

By (6) and (7), we get

(8) $|x - c| < \delta. \supset .|x - c| < \delta_1$,

(9) $|x - c| < \delta. \supset .|x - c| < \delta_2$.

(This is another step which is not purely logical.) By (3), (4), and
Axiom scheme 6, we get

$|x - c| < \delta_1. \supset .|f_1(x) - f_1(c)| < \varepsilon/2,$

$|x - c| < \delta_2. \supset .|f_2(x) - f_2(c)| < \varepsilon/2.$

By (8) and (9), we get

$|x - c| < \delta. \supset .|f_1(x) - f_1(c)| < \varepsilon/2,$

$|x - c| < \delta. \supset .|f_2(x) - f_2(c)| < \varepsilon/2.$

By Thm.IV.4.17,

(10) $|x - c| < \delta. \supset :|f_1(x) - f_1(c)| < \varepsilon/2.|f_2(x) - f_2(c)| < \varepsilon/2.$

However,

(11) $|f_1(x) - f_1(c)| < \varepsilon/2.|f_2(x) - f_2(c)| < \varepsilon/2: \supset :|f_1(x) + f_2(x) - f_1(c) - f_2(c)| < \varepsilon$

(This is our third step which is not purely logical. Bocher makes two steps
out of it, since he is more interested in the mathematics of the proof than
in the logic.) By (10) and (11),

$|x - c| < \delta. \supset .|f_1(x) + f_2(x) - f_1(c) - f_2(c)| < \varepsilon.$

As $x$ does not occur free in either (1), (2), (3), or (4), we can use rule G
and get

$(x):|x - c| < \delta. \supset .|f_1(x) + f_2(x) - f_1(c) - f_2(c)| < \varepsilon.$

By (5), we get

$\delta > 0:(x):|x - c| < \delta. \supset .|f_1(x) + f_2(x) - f_1(c) - f_2(c)| < \varepsilon.$

Finally, by Thm.VI.7.3,

$(E\delta):\delta > 0:(x):|x - c| < \delta. \supset .|f_1(x) + f_2(x) - f_1(c) - f_2(c)| < \varepsilon.$

So we have proved

$A_1 \& A_2, \varepsilon > 0 \vdash_C (E\delta):\delta > 0:(x):|x - c| < \delta. \supset .|f_1(x) + f_2(x) - f_1(c) - f_2(c)| < \varepsilon.$

As neither $\delta_1$ nor $\delta_2$ occurs at all in the conclusion, we can use
Thm.VI.7.2 and replace $\vdash_C$ by $\vdash$. Then, as indicated earlier, we can conclude the
proof by using the deduction theorem and the generalization theorem.

EXERCISES 
VI.7.1. Prove:

(a) $\vdash (x) P \supset (Ex) P$.

(b) $\vdash (Ex) F(x). \equiv .(Ex,y).F(x) \lor F(y)$.

(c) $\vdash (Ex) P.(x) Q. \supset .(Ex).PQ$.

(d) $\vdash (x,y).P \supset Q: \supset :(x,y) P. \supset .(x,y) Q$.

(e) $\vdash (x)(Ey).P \supset Q: \supset :(Ex)(y) P. \supset .(Ex,y) Q$.

(f) $\vdash (x).P \lor Q: \supset :(x) P.\lor.(Ex) Q$.

(g) $\vdash (x) P.\lor.(Ex) Q: \supset :(Ex).P \lor Q$.

(h) $\vdash (Ex) P. \supset .(x) Q: \supset :(x).P \supset Q$.

(i) $\vdash (Ex) P. \supset .(Ex) Q: \supset :(Ex).P \supset Q$.

(j) $\vdash (x) P. \supset .(x) Q: \supset :(Ex).P \supset Q$.

VI.7.2. If there are no free occurrences of $y$ in $P$ and no free occurrences
of $x$ in $Q$, prove:

(a) $\vdash (x) P.(y) Q: \equiv :(x,y).PQ$.

(b) $\vdash (x)(Ey).PQ: \equiv :(Ey)(x).PQ$.

(c) $\vdash (y) Q. \supset .(Ex) P: \equiv :(Ex,y).Q \supset P$.

VI.7.3. Find the flaws in the following invalid proofs of incorrect
statements:

(a) To prove $(Ex) P \vdash (x) P$. Start with $(Ex) P$. Then $P$ by rule C.
Then $(x) P$ by rule G. So $(Ex) P \vdash_C (x) P$. So $(Ex) P \vdash (x) P$.

(b) To prove $(Ex) P \vdash P$. Start with $(Ex) P$. Then $P$ by rule C. So
$(Ex) P \vdash_C P$. So $(Ex) P \vdash P$.

(c) To prove $\vdash (Ex) P \supset P$. We use reductio ad absurdum. So start
with $(Ex) P.{\sim}P$. Then $(Ex) P$ and ${\sim}P$. So by rule C, $P$ and ${\sim}P$.
So $P{\sim}P$. So $(x) Q.{\sim}(x) Q$ by Thm.VI.6.1, Part LXV. So
$(Ex) P.{\sim}P \vdash_C (x) Q.{\sim}(x) Q$. So $(Ex) P.{\sim}P \vdash (x) Q.{\sim}(x) Q$.
So $\vdash (Ex) P.{\sim}P: \supset :(x) Q.{\sim}(x) Q$. So $\vdash {\sim}((Ex) P.{\sim}P)$ by
Ex. II.3.1, Part (f). This is $\vdash (Ex) P \supset P$.

(d) To prove $(Ex) P.(Ex) Q \vdash (Ex) PQ$. Start with $(Ex) P.(Ex) Q$. Then
$(Ex) P$ and $(Ex) Q$. So by rule C, $P$ and $Q$. So $PQ$. So $(Ex) PQ$
by Thm.VI.7.3. So $(Ex) P.(Ex) Q \vdash_C (Ex) PQ$. So
$(Ex) P.(Ex) Q \vdash (Ex) PQ$.

(e) To prove $(y)(Ex) P \vdash (Ex)(y) P$. Start with $(y)(Ex) P$. Then
$(Ex) P$ by Thm.VI.5.1. Then $P$ by rule C. Then $(y) P$ by rule G.
Then $(Ex)(y) P$ by Thm.VI.7.3. So $(y)(Ex) P \vdash_C (Ex)(y) P$. So
$(y)(Ex) P \vdash (Ex)(y) P$.

(f) To prove $(x).P \supset Q, (Ex) P \vdash (x) Q$. Start with $(x).P \supset Q$ and
$(Ex) P$. Then $P \supset Q$ by Thm.VI.5.1 and $P$ by rule C. So $Q$ by
modus ponens. So $(x) Q$ by rule G. So $(x).P \supset Q, (Ex) P \vdash_C (x) Q$.
So $(x).P \supset Q, (Ex) P \vdash (x) Q$.

(g) To prove $(Ex).P \supset Q, (Ex) P \vdash (Ex) Q$. Start with $(Ex).P \supset Q$ and
$(Ex) P$. Then $P \supset Q$ and $P$ by rule C. So $Q$ by modus ponens. So
$(Ex) Q$ by Thm.VI.7.3. So $(Ex).P \supset Q, (Ex) P \vdash_C (Ex) Q$. So
$(Ex).P \supset Q, (Ex) P \vdash (Ex) Q$.

(h) To prove $(x).P \supset Q, (Ex) P \vdash Q$. Start with $(x).P \supset Q$ and $(Ex) P$.
Then $P \supset Q$ by Thm.VI.5.1 and $P$ by rule C. So $Q$ by modus ponens.
So $(x).P \supset Q, (Ex) P \vdash_C Q$. So $(x).P \supset Q, (Ex) P \vdash Q$.

8. Restricted Quantification. The expression $(x) F(x)$ indicates that
$F(x)$ is true where $x$ is any logical entity whatsoever. In practical
mathematical discussions this amount of generality is far more than is desirable
or useful. For instance, instead of "For all $x$, $A(x)$" or "There is an $x$ such
that $A(x)$," one will usually find in mathematical discussions such
expressions as "For all positive integers, $x$, $A(x)$," or "There is a prime,
$p \equiv 1 \pmod 4$, such that $A(p)$," or "For each $\varepsilon > 0$, $A(\varepsilon)$," etc. Even when such
expressions as "For all $x$, $A(x)$," or "There is an $x$ such that $A(x)$," do
occur in mathematical discussions, it is almost always understood tacitly
that there are certain restrictions on the $x$, and that what is meant is
something like "For all real numbers, $x$, $A(x)$," or "There is an angle, $x$,
such that $A(x)$."

In short, the types of quantifiers which are useful in mathematics take
the general forms "For all $x$ of kind $K$, $A(x)$" or "There is an $x$ of kind $K$
such that $A(x)$." We refer to these as restricted quantifiers.

It is very easy to express restricted quantifiers in symbolic logic. Let
$K(x)$ and $F(x)$ be translations of "$x$ is of kind $K$" and "$A(x)$." Then
$(x).K(x) \supset F(x)$ and $(Ex) K(x)F(x)$ are the translations of "For all $x$ of
kind $K$, $A(x)$" and "There is an $x$ of kind $K$ such that $A(x)$." Thus the
translation of restricted quantifiers presents no problem. However, there
is another question besides mere translation, namely, a question of
convenience. In ordinary mathematics the device of restricting certain letters
to denote values from specified ranges saves a large amount of repetition
of hypotheses and reiteration of restrictive conditions. Thus one may find
it stated in the beginning of some text that, throughout the text, $x$ and $y$
shall denote real numbers, $\delta$ and $\varepsilon$ shall denote positive real numbers, $m$ and
$n$ shall denote positive integers, etc. Then such conditions need not be
inserted in the statements of theorems, or in the proofs, or in definitions,
or in discussions. The resulting abridgment not only saves space but
facilitates comprehension.

One can partially adopt such conventions in symbolic logic, and it is quite
worth while to do so. The procedure is not difficult. If there is some
condition, $K(x)$, which we wish to impose on certain quantities throughout
a given discussion, we choose certain letters which are to denote quantities
satisfying the condition $K(x)$ throughout our discussion. Suppose we
choose $\alpha$ and $\beta$ for this purpose. Then throughout our discussion we
understand $(\alpha) F(\alpha)$ to be an abbreviation for $(x).K(x) \supset F(x)$, and we
understand $(E\alpha) F(\alpha)$ to be an abbreviation for $(Ex) K(x)F(x)$. Similarly for $\beta$.
In such case we speak of $(\alpha)$ and $(E\alpha)$ as restricted quantifiers.
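
To make the two abbreviations concrete, here is a minimal Lean 4 sketch (core library only). The names and the carrier type `U` are assumptions introduced for illustration; `K` stands for the restricting condition. The restricted quantifiers are written out in their unabbreviated forms.

```lean
-- (α) F(α) is read as (x). K(x) ⊃ F(x):
-- it can be applied to any `a` that is known to satisfy K.
theorem restricted_all_apply {U : Type} (K F : U → Prop)
    (h : ∀ x, K x → F x) (a : U) (ha : K a) : F a :=
  h a ha

-- (Eα) F(α) is read as (Ex) K(x)F(x):
-- a witness must satisfy both K and F.
theorem restricted_ex_intro {U : Type} (K F : U → Prop)
    (a : U) (ha : K a) (hf : F a) : ∃ x, K x ∧ F x :=
  ⟨a, ha, hf⟩
```

Note the asymmetry: the restriction enters the universal form as the antecedent of an implication and the existential form as a conjunct.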

One can at the same time have other letters denoting quite different
quantities. Thus, at the same time that we interpret $(\alpha)$, $(\beta)$, $(E\alpha)$, and
$(E\beta)$ as indicated above, we can agree that $(\gamma) F(\gamma)$ shall denote
$(x).L(x) \supset F(x)$ and $(E\gamma) F(\gamma)$ shall denote $(Ex) L(x)F(x)$. And we can
at the same time let $(\delta) F(\delta)$ or $(\varepsilon) F(\varepsilon)$ denote $(x).x > 0 \supset F(x)$, and let
$(E\delta) F(\delta)$ or $(E\varepsilon) F(\varepsilon)$ denote $(Ex):x > 0.F(x)$. One can mix the various
kinds of quantifiers. Thus

$(\alpha,\gamma) F(\alpha,\gamma)$

would denote

$(x):K(x). \supset .(y).L(y) \supset F(x,y),$

or any of the equivalent forms

$(x,y):K(x). \supset .L(y) \supset F(x,y),$
$(x,y):K(x)L(y). \supset .F(x,y),$
$(y,x):L(y)K(x). \supset .F(x,y),$
$(y,x):L(y). \supset .K(x) \supset F(x,y),$
$(y):L(y). \supset .(x).K(x) \supset F(x,y).$

The last of these is just what we would denote by $(\gamma,\alpha) F(\alpha,\gamma)$. So we
have

$\vdash (\alpha,\gamma) P \equiv (\gamma,\alpha) P$

even when $\alpha$ and $\gamma$ are restricted quantifiers, and even when we do not
have the same restriction on each of $\alpha$ and $\gamma$.

However, even with different restrictions on $\alpha$ and $\gamma$, we cannot infer the
equivalence of $(E\alpha)(\gamma) P$ and $(\gamma)(E\alpha) P$.
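
The two remarks above can be stated side by side in a short Lean 4 sketch (core library only; the names and the type `U` are assumptions for illustration, with `K` and `L` the two restrictions). Restricted universal quantifiers commute, while for a mixed pair only one direction is provable in general.

```lean
-- (α,γ) P ≡ (γ,α) P, written out without restricted quantifiers.
theorem restricted_all_comm {U : Type} (K L : U → Prop) (F : U → U → Prop) :
    (∀ x, K x → ∀ y, L y → F x y) ↔ (∀ y, L y → ∀ x, K x → F x y) :=
  ⟨fun h y hl x hk => h x hk y hl, fun h x hk y hl => h y hl x hk⟩

-- (Eα)(γ) P ⊃ (γ)(Eα) P: the one direction that does hold.
theorem restricted_ex_all_swap {U : Type} (K L : U → Prop) (F : U → U → Prop) :
    (∃ x, K x ∧ (∀ y, L y → F x y)) → ∀ y, L y → ∃ x, K x ∧ F x y :=
  fun ⟨x, hk, h⟩ => fun y hl => ⟨x, hk, h y hl⟩
```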

We notice that by Thm.VI.6.8 it is quite immaterial whether we interpret
$(\alpha) F(\alpha)$ as $(x).K(x) \supset F(x)$ or $(y).K(y) \supset F(y)$, provided that we observe
the restrictions implied by our conventions in writing $F(\alpha)$, $F(x)$, $F(y)$,
$K(\alpha)$, $K(x)$, $K(y)$, namely, that $F(x)$ is the result of replacing all free
occurrences of $\alpha$ in $F(\alpha)$ by occurrences of $x$ and $F(\alpha)$ is the result of replacing
all free occurrences of $x$ in $F(x)$ by occurrences of $\alpha$, with similar
understandings for $K(\alpha)$ and $K(x)$, $F(x)$ and $F(y)$, and $K(x)$ and $K(y)$. Such an
understanding is necessary to assure that $(x).K(x) \supset F(x)$ has the meaning
intended for $(\alpha) F(\alpha)$.

The above conventions ensure that, if we have $(\alpha) F(\alpha)$ where $F(\alpha)$
contains no free occurrences of $\alpha$, then we must choose an $x$ which does not
occur free in $F(\alpha)$ when we write $(x).K(x) \supset F(x)$ as the interpretation of
$(\alpha) F(\alpha)$. Then $x$ will not occur free in $F(x)$, and $F(\alpha)$ and $F(x)$ are the
same.

We now state the remarkable fact that, if we confine our attention to
formulas with no free occurrences of the quantified variable, then all the
theorems which we have proved for unrestricted quantifiers hold also for
restricted quantifiers, except that a few of them require the hypotheses
$(Ex) K(x)$, $(Ex) L(x)$, $(Ex).x > 0$, etc. This requirement means that, if we
are going to deal with restricted quantifiers, our restriction should not be
so severe that it is not satisfied by any quantities at all. In practical cases,
the restrictions $K(x)$, $L(x)$, etc., will usually be conditions such as "$x$ is a
real number," or "$x$ is a vector," or the like, and we shall certainly have
$(Ex) K(x)$, $(Ex) L(x)$, etc.
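
One theorem that needs such a hypothesis is the restricted form of $(x) P \supset (Ex) P$. The following Lean 4 sketch (core library only; names and the type `U` are illustrative assumptions) shows the hypothesis $(Ex) K(x)$ at work, and then shows what goes wrong for a restriction that nothing satisfies.

```lean
-- Restricted form of (x) P ⊃ (Ex) P; it uses the hypothesis (Ex) K(x).
theorem restricted_ex_of_all {U : Type} (K F : U → Prop)
    (hne : ∃ x, K x) (h : ∀ x, K x → F x) : ∃ x, K x ∧ F x :=
  match hne with
  | ⟨a, ha⟩ => ⟨a, ha, h a ha⟩

-- With the unsatisfiable restriction `False`, the restricted universal
-- statement holds vacuously for every F ...
theorem vacuous_restricted_all {U : Type} (F : U → Prop) :
    ∀ x : U, False → F x :=
  fun _ hf => hf.elim

-- ... while the restricted existential statement fails for every F.
theorem no_restricted_ex {U : Type} (F : U → Prop) :
    ¬ (∃ x : U, False ∧ F x) :=
  fun ⟨_, hf, _⟩ => hf
```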
The easiest way to verify that the theorems with no free occurrences of
the quantified variables, which we have proved, hold also for restricted
quantifiers is to check them one by one. Actually, in most cases all that is
required is to write the interpretation of the given theorem, and it is then a
simple exercise to prove it.

We should consider first our axioms. Suppose $(x) F(x)$ is an axiom; can
we prove $(\alpha) F(\alpha)$? This latter signifies $(x).K(x) \supset F(x)$. Since $(x) F(x)$
is an axiom, we have $\vdash (x) F(x)$. Then $\vdash F(x)$ by Thm.VI.5.1. However,
by Thm.IV.4.28, corollary, $\vdash F(x). \supset .K(x) \supset F(x)$. So $\vdash K(x) \supset F(x)$.
Then by Thm.VI.4.1, $\vdash (x).K(x) \supset F(x)$. That is, $\vdash (\alpha) F(\alpha)$.

This takes care of all axioms of the form $(x) F(x)$.
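
The argument just given amounts to discarding the restriction. A one-line Lean 4 sketch (core library only; the name and the type `U` are assumptions for illustration):

```lean
-- From (x) F(x) infer (x). K(x) ⊃ F(x); the hypothesis K(x) is not used.
theorem restricted_all_of_all {U : Type} (K F : U → Prop)
    (h : ∀ x, F x) : ∀ x, K x → F x :=
  fun x _ => h x
```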

Now consider $(x).F(x) \supset G(x): \supset :(x) F(x). \supset .(x) G(x)$, one of the
instances of Axiom scheme 4.

Since this does not have the form $(x) F(x)$, it is not covered by our earlier
analysis. We must prove

$\vdash (\alpha).F(\alpha) \supset G(\alpha): \supset :(\alpha) F(\alpha). \supset .(\alpha) G(\alpha).$

This signifies

$\vdash (x):K(x) \supset .F(x) \supset G(x).: \supset :.(x).K(x) \supset F(x): \supset :(x).K(x) \supset G(x).$

We prove without difficulty

$(x):K(x) \supset .F(x) \supset G(x), (x).K(x) \supset F(x), K(x) \vdash G(x).$

Then by the deduction theorem and the generalization theorem, we get

$(x):K(x) \supset .F(x) \supset G(x), (x).K(x) \supset F(x) \vdash (x).K(x) \supset G(x).$

Then the desired result follows by two more uses of the deduction theorem.
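
The restricted instance of Axiom scheme 4 can be written out in Lean 4 as a sketch (core library only; the name and the type `U` are illustrative assumptions). The body of the proof is the step described above as proved "without difficulty": from $K(x)$ one obtains $F(x) \supset G(x)$ and $F(x)$, hence $G(x)$.

```lean
-- Restricted Axiom scheme 4, written without restricted quantifiers.
theorem restricted_scheme4 {U : Type} (K F G : U → Prop) :
    (∀ x, K x → (F x → G x)) → (∀ x, K x → F x) → ∀ x, K x → G x :=
  fun hfg hf x hk => hfg x hk (hf x hk)
```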

Now consider $P \supset (x) P$, where there are no free occurrences of $x$ in $P$.
This is an instance of Axiom scheme 5. We wish to prove $\vdash P \supset (\alpha) P$,
where $\alpha$ does not occur free in $P$. This signifies $\vdash P \supset .(x).K(x) \supset P$,
where $x$ does not occur free in $P$. By Thm.IV.4.28, corollary, we have
$\vdash P \supset .K(x) \supset P$. So by Thm.VI.4.1, $\vdash (x):P \supset .K(x) \supset P$. So by
Thm.VI.6.6, Part IX, $\vdash P \supset .(x).K(x) \supset P$.
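
In a Lean 4 sketch (core library only; the name and the type `U` are assumptions for illustration), the condition that $x$ does not occur free in $P$ is expressed by taking `P` to be a plain proposition rather than a predicate of $x$.

```lean
-- Restricted Axiom scheme 5: P ⊃ .(x).K(x) ⊃ P, with x not free in P.
theorem restricted_scheme5 {U : Type} (K : U → Prop) (P : Prop) :
    P → ∀ x : U, K x → P :=
  fun hp _ _ => hp
```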

Axiom  scheme  6  can  involve  free  occurrences  of  the  quantified  variable, 
and  so  we  are  not  concerned  with  it  at  the  moment.  Likewise  Thm.VI.4.1, 
Thm.VI.4.2,  and  Thm.VI.5.1.  We  postpone  Thm.VI.5.2  because  it  is  a 
special case of Thm.VI.6.5, which we shall take up in its place.

Now consider Thm.VI.5.3. We wish to prove

\[ \vdash (\alpha).F(\alpha) \equiv G(\alpha): \supset :(\alpha)\,F(\alpha). \equiv .(\alpha)\,G(\alpha). \]

This signifies

\[ \vdash (x):K(x) \supset .F(x) \equiv G(x).: \supset :.(x).K(x) \supset F(x): \equiv :(x).K(x) \supset G(x). \]

This is easily proved, since we can prove each of

\[ \vdash (x):K(x) \supset .F(x) \equiv G(x).: \supset :.(x).K(x) \supset F(x): \supset :(x).K(x) \supset G(x) \]

and

\[ \vdash (x):K(x) \supset .F(x) \equiv G(x).: \supset :.(x).K(x) \supset G(x): \supset :(x).K(x) \supset F(x) \]

by the procedure used for Axiom scheme 4.

In case some of the $y$'s are restricted quantifiers which occur free in
$W \equiv V$, Thm.VI.5.4 will not hold. However, Thm.VI.5.5 and Thm.VI.5.6
hold in any case. The method of proof is very easy, and we illustrate with
an example. Suppose we have $\vdash F(\alpha) \equiv G(\alpha)$, and wish to prove

\[ \vdash P \supset (\alpha)\,F(\alpha). \equiv .P \supset (\alpha)\,G(\alpha). \]

The latter signifies

\[ \vdash P \supset .(x).K(x) \supset F(x): \equiv :P \supset .(x).K(x) \supset G(x). \]

By Thm.VI.6.8, this is equivalent to

\[ \vdash P \supset .(\alpha).K(\alpha) \supset F(\alpha): \equiv :P \supset .(\alpha).K(\alpha) \supset G(\alpha), \]

in which, momentarily for the purposes of a proof, we are not considering
the $\alpha$ as a restricted quantifier. Now this latter follows from
$\vdash F(\alpha) \equiv G(\alpha)$ by Thm.VI.5.5.

Thm.VI.6.2 is a special case of the duality theorem, and so we proceed
directly to the duality theorem, Thm.VI.6.3. We first have to generalize
the definition of a dual. In addition to replacing $(x)$ by $(Ex)$ and vice versa,
we replace $(\alpha)$ by $(E\alpha)$ and vice versa, and similarly for any other letter
denoting a restricted quantity. In a word, we treat restricted quantifiers
just like unrestricted quantifiers when forming the dual. To prove the
duality theorem, we show that, if we write out $(\alpha)\,F(\alpha)$ without restricted
quantifiers and take the dual in the usual manner, we merely get what
$(E\alpha)\,F^*(\alpha)$ signifies, and vice versa (where $F^*(\alpha)$ is the dual of $F(\alpha)$). If we
write $(\alpha)\,F(\alpha)$ with an unrestricted quantifier, we get $(x).K(x) \supset F(x)$,
which is equivalent to $(x).{\sim}K(x) \vee F(x)$, whose dual is $(Ex).K(x).F^*(x)$,
which is just $(E\alpha)\,F^*(\alpha)$ written with an unrestricted quantifier. Similarly,
we proceed from $(E\alpha)\,F^*(\alpha)$ back to $(\alpha)\,F(\alpha)$ if we take the dual of the
corresponding statement written with an unrestricted quantifier. So the
duality theorem is easily verified.
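
The computation above can be illustrated in Lean 4 (core library only). This is a sketch under two assumptions that are not spelled out in this passage: the dual $F^*(x)$ is read as the negation of $F(x)$, and classical reasoning is allowed. The names `D` and `restricted_dual` are hypothetical.

```lean
-- Negating a restricted universal gives a restricted existential:
-- ∼(α) F(α) is equivalent to (Eα) ∼F(α), both written out with K.
theorem restricted_dual {D : Type} (K F : D → Prop) :
    ¬ (∀ x, K x → F x) ↔ ∃ x, K x ∧ ¬ F x :=
  Iff.intro
    -- Left to right needs classical logic (proof by contradiction, twice).
    (fun h => Classical.byContradiction (fun hne =>
      h (fun x hK => Classical.byContradiction (fun hnF => hne ⟨x, hK, hnF⟩))))
    -- Right to left: a witness x with K x and ∼F x refutes the universal.
    (fun ⟨x, hK, hnF⟩ hall => hnF (hall x hK))
```

Note that the restriction `K x` is not negated in passing to the existential; it changes from a hypothesis to a conjunct, which is the point made in the text.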

Likewise the important corollary to the duality theorem, namely,
Thm.VI.6.4, also holds, it being proved exactly as when we were dealing
only with unrestricted quantification. By means of it, we can get Part II
of Thm.VI.6.5 from Part I; Parts II, IV, and V of Thm.VI.6.6 from
Parts I, III, and VI; Part II of Thm.VI.6.7 from Part I; and Part II of
Thm.VI.6.8 from Part I.

Now let us look at Part I of Thm.VI.6.5. We desire to prove

\[ \vdash (\alpha).F(\alpha).G(\alpha): \equiv :(\alpha)\,F(\alpha).(\alpha)\,G(\alpha). \]

This signifies

\[ \vdash (x):K(x) \supset .F(x).G(x).: \equiv :.(x).K(x) \supset F(x):(x).K(x) \supset G(x). \]

One easily proves

\[ (x):K(x) \supset .F(x).G(x),\quad K(x) \vdash F(x) \]

and so gets

\[ \vdash (x):K(x) \supset .F(x).G(x).: \supset :.(x).K(x) \supset F(x). \]

Similarly

\[ \vdash (x):K(x) \supset .F(x).G(x).: \supset :.(x).K(x) \supset G(x), \]

and we have half of our theorem. Conversely, we easily get

\[ (x).K(x) \supset F(x):(x).K(x) \supset G(x),\quad K(x) \vdash F(x).G(x), \]

which gives the other half of our theorem.
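
Both halves of this argument fit in a few lines of Lean 4 (core library only). This is a sketch with the restricted quantifiers written out; `D` and the theorem name are hypothetical.

```lean
-- Part I of Thm.VI.6.5 for restricted quantification:
-- (α).F(α).G(α) is equivalent to (α) F(α).(α) G(α).
theorem restricted_all_and {D : Type} (K F G : D → Prop) :
    (∀ x, K x → F x ∧ G x) ↔ (∀ x, K x → F x) ∧ (∀ x, K x → G x) :=
  Iff.intro
    -- First half: project each conjunct under the hypothesis K x.
    (fun h => ⟨fun x hK => (h x hK).left, fun x hK => (h x hK).right⟩)
    -- Converse: from K x get F x and G x separately, then combine.
    (fun h x hK => ⟨h.left x hK, h.right x hK⟩)
```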

Now let us look at Part I of Thm.VI.6.6. We desire to prove
$\vdash (\alpha)\,Q \equiv Q$, where there are no free occurrences of $\alpha$ in $Q$. This signifies
$\vdash (x).K(x) \supset Q: \equiv Q$, where $x$ does not occur free in $Q$. Since we are
assuming $\vdash (Ex)\,K(x)$, we have by Thm.VI.6.1, Part LXIX,
$\vdash Q \equiv :(Ex)\,K(x). \supset .Q$. So by Thm.VI.6.6, Part VII, $\vdash Q \equiv :(x).K(x) \supset Q$.
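
The role of the assumption $\vdash (Ex)\,K(x)$ shows up clearly in a formal check. The following Lean 4 sketch (core library only; `D`, `hne`, and the theorem name are hypothetical) uses the nonemptiness hypothesis in exactly one direction.

```lean
-- Part I of Thm.VI.6.6 for restricted quantification: (α) Q ≡ Q,
-- where Q does not depend on the quantified variable.
theorem restricted_all_const {D : Type} (K : D → Prop) (Q : Prop)
    (hne : ∃ x, K x) :                -- the standing assumption (Ex) K(x)
    (∀ x, K x → Q) ↔ Q :=
  Iff.intro
    -- Here we need some x satisfying K in order to use the hypothesis.
    (fun h =>
      match hne with
      | ⟨x, hK⟩ => h x hK)
    -- The converse ignores both x and K x.
    (fun hQ _ _ => hQ)
```

Without `hne` the left-to-right direction would fail: if nothing satisfies `K`, then `∀ x, K x → Q` holds vacuously whatever `Q` is.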

We  now  get  Part  III  of  Thm.VI.6.6  by  the  same  proof  that  was  used 
when  dealing  with  unrestricted  quantifiers. 

We now consider Part VI of Thm.VI.6.6. We desire to prove
$\vdash (E\alpha):F(\alpha).Q.: \equiv :.(E\alpha).F(\alpha):Q$. This signifies
$\vdash (Ex):K(x).F(x).Q.: \equiv :.(Ex).K(x).F(x):Q$. This is an easy consequence of
Thm.VI.6.6, Part VI, if we take $P$ to be $K(x)F(x)$.

We now quickly conclude the rest of Thm.VI.6.6, because we get Part
VII by putting ${\sim}P$ for $P$ in Part V, Part VIII by putting ${\sim}P$ for $P$ in
Part IV, Part IX by putting ${\sim}Q$ for $Q$ in Part V, and Part X by putting
${\sim}Q$ for $Q$ in Part IV.

We  have  already  indicated  how  to  prove  Part  I  of  Thm.VI.6.7. 

Now consider Part I of Thm.VI.6.8. We wish to prove
$\vdash (\alpha)\,F(\alpha) \equiv (\beta)\,F(\beta)$.

CAUTION. Unless $\alpha$ and $\beta$ have the same restrictions, this is not true.

As we are assuming $\alpha$ and $\beta$ both subject to the restriction $K(x)$, this
signifies $\vdash (x).K(x) \supset F(x): \equiv :(y).K(y) \supset F(y)$, which is immediate.

Had we tried to prove $\vdash (\alpha)\,F(\alpha) \equiv (\gamma)\,F(\gamma)$, we should have been trying
to prove $\vdash (x).K(x) \supset F(x): \equiv :(y).L(y) \supset F(y)$, which in general is not
true unless one assumes some special relationships between $F(x)$, $K(x)$, and
$L(x)$.


Part I of Thm.VI.7.4 would read

\[ \vdash (E\alpha).F(\alpha).G(\alpha): \supset :(E\alpha)\,F(\alpha):(E\alpha)\,G(\alpha), \]

signifying

\[ \vdash (Ex).K(x).F(x).G(x): \supset :(Ex).K(x).F(x):(Ex).K(x).G(x). \]

This is easily proved since

\[ \vdash K(x).F(x).G(x): \equiv :K(x).F(x):K(x).G(x). \]

From  Part  I  one  can  get  Part  II  in  the  same  manner  as  for  unrestricted 
quantifiers. 

Thm.VI.7.5 would read $\vdash (E\alpha)(\gamma)\,F(\alpha,\gamma). \supset .(\gamma)(E\alpha)\,F(\alpha,\gamma)$, signifying

\[ \vdash (Ex):K(x):(y).L(y) \supset F(x,y).: \supset :.(y):L(y) \supset .(Ex).K(x).F(x,y). \]

One proves in succession

\[ (Ex):K(x):(y).L(y) \supset F(x,y),\quad L(y) \vdash_C (Ex).K(x).F(x,y), \]

\[ (Ex):K(x):(y).L(y) \supset F(x,y),\quad L(y) \vdash (Ex).K(x).F(x,y), \]

\[ (Ex):K(x):(y).L(y) \supset F(x,y) \vdash (y):L(y) \supset .(Ex).K(x).F(x,y). \]
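
The three successive steps can be mirrored in Lean 4 (core library only). In this sketch the `match` plays the part of rule C and the leading `fun y hL` plays the part of the deduction theorem followed by generalization; `D` and the theorem name are hypothetical.

```lean
-- Restricted form of Thm.VI.7.5: (Eα)(γ) F(α,γ) ⊃ (γ)(Eα) F(α,γ),
-- with α restricted by K and γ restricted by L.
theorem restricted_ex_all_swap {D : Type} (K L : D → Prop)
    (F : D → D → Prop)
    (h : ∃ x, K x ∧ ∀ y, L y → F x y) :
    ∀ y, L y → ∃ x, K x ∧ F x y :=
  fun y hL =>
    -- Choose the witness x once; it works for every y satisfying L.
    match h with
    | ⟨x, hK, hF⟩ => ⟨x, hK, hF y hL⟩
```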

Thm.VI.7.6 would read

\[ \vdash (\alpha).F(\alpha) \supset G(\alpha): \supset :(E\alpha)\,F(\alpha). \supset .(E\alpha)\,G(\alpha), \]

signifying

\[ \vdash (x):K(x) \supset .F(x) \supset G(x).: \supset :.(Ex).K(x).F(x): \supset :(Ex).K(x).G(x). \]

This is easily proved, following the pattern of proof of the original
Thm.VI.7.6.

Similarly for Thm.VI.7.7.

We now raise the question of the significance of $F(\alpha)$, in which the
occurrences of $\alpha$ are free. In deciding to take $(x).K(x) \supset F(x)$ and
$(Ex).K(x).F(x)$ as the meanings of $(\alpha)\,F(\alpha)$ and $(E\alpha)\,F(\alpha)$, we were guided by the
intuitive meanings. In the case of $F(\alpha)$, the intuitive meaning does not
furnish a satisfactory guide. In everyday mathematics, if it has been
agreed that $\alpha$ stands for a quantity satisfying the restriction $K(\alpha)$, it is
commonly the case that, if one is assuming $F(\alpha)$, then $K(\alpha) \mathbin{\&} F(\alpha)$ is
understood, but if one is trying to prove $F(\alpha)$, then $K(\alpha) \supset F(\alpha)$ is understood.
It seems that in symbolic logic perhaps it is best not to give any especial
significance to the $\alpha$ in $F(\alpha)$ when it occurs free. This does not cause any
confusion, because it has the effect of associating the restriction with the
quantifiers $(\alpha)$ and $(E\alpha)$, rather than merely with the letter $\alpha$. As the
restriction is associated merely with the letter in everyday mathematics, we
are using a different system from that used in everyday mathematics, and
the reader who is accustomed to the procedures of everyday mathematics
should note carefully the difference in treatment.

Accordingly, if a variable occurs both free and bound in a statement, the
free occurrences are subject to no restriction, whereas the bound occurrences
may be restricted. One can always deal with this situation by giving the
restricted quantifiers their actual significance. Thus, by Thm.VI.5.1, we
have

\[ \vdash (x).K(x) \supset F(x): \supset :K(x) \supset F(x). \]

Using restricted quantification, we can write this as

\[ \vdash (\alpha)\,F(\alpha). \supset .K(x) \supset F(x). \]

This is then the form which Axiom scheme 6 takes with restricted
quantification. A moment's thought will indicate that this is the only form it
could take. Since $(\alpha)\,F(\alpha)$ means that $F(\alpha)$ is true for all $\alpha$ satisfying
$K(\alpha)$, one could not expect to infer $F(x)$ without the prior hypothesis $K(x)$.
A similar situation holds with respect to Thm.VI.7.3. By the corollary
to this we have

\[ \vdash K(x).F(x): \supset :(Ex).K(x).F(x). \]

That is,

\[ \vdash K(x).F(x): \supset :(E\alpha)\,F(\alpha). \]

This is as close to Thm.VI.7.3 as one can come with restricted
quantification, and it is as close as one would expect to come.

If one applies rule C to $(E\alpha)\,F(\alpha)$, then, since this is really $(Ex).K(x).F(x)$,
one gets not merely $F(y)$, but $K(y).F(y)$; again just what one would
expect.
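
The three restricted forms just described (instantiation, existential introduction, and rule C) can be stated side by side in Lean 4 (core library only). This is a sketch; the names `D`, `R`, `step`, and the theorem names are hypothetical.

```lean
-- Restricted Axiom scheme 6: from (α) F(α) one gets F x only under K x.
theorem restricted_inst {D : Type} (K F : D → Prop) (x : D)
    (h : ∀ a, K a → F a) : K x → F x :=
  h x

-- Restricted Thm.VI.7.3: to infer (Eα) F(α) one must supply K x as well as F x.
theorem restricted_ex_intro {D : Type} (K F : D → Prop) (x : D)
    (h : K x ∧ F x) : ∃ a, K a ∧ F a :=
  ⟨x, h⟩

-- Restricted rule C: the chosen y comes with K y . F y, not merely F y.
theorem restricted_rule_C {D : Type} (K F : D → Prop) (R : Prop)
    (h : ∃ a, K a ∧ F a) (step : ∀ y, K y ∧ F y → R) : R :=
  match h with
  | ⟨y, hy⟩ => step y hy
```

In each case the free variable `x` or `y` carries no restriction of its own; the restriction appears as an explicit hypothesis or conjunct, as the text recommends.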

In connection with rule G, the situation is more difficult. If one has
proved $\vdash F(\alpha)$, then one easily gets $\vdash K(\alpha) \supset F(\alpha)$, and so $\vdash (\alpha)\,F(\alpha)$ by
Thm.VI.4.1. However, the more usual situation is that in which one has
$P_1, P_2, \ldots, P_n, K(\alpha) \vdash_C F(\alpha)$ and wishes to progress to
$P_1, P_2, \ldots, P_n \vdash_C (\alpha)\,F(\alpha)$; naturally one can hope to do this only in case $\alpha$
satisfies the various restrictions imposed on a use of rule G. If we were
dealing with $\vdash$ instead of $\vdash_C$, the matter would be simple. First we proceed to
$P_1, P_2, \ldots, P_n \vdash K(\alpha) \supset F(\alpha)$ by the deduction theorem, and then Thm.VI.4.2
gives $P_1, P_2, \ldots, P_n \vdash (\alpha)\,F(\alpha)$. As a matter of fact, except in the most
complicated cases, one can proceed from $P_1, P_2, \ldots, P_n, K(\alpha) \vdash_C F(\alpha)$ to
$P_1, P_2, \ldots, P_n, K(\alpha) \vdash F(\alpha)$, and then proceed as indicated. Our proof of
the version of Thm.VI.7.5 with restricted quantification is an instance
of this.

However, in very complicated cases, we shall have to keep the $\vdash_C$. In
such cases, what we need is a generalized deduction theorem. If we could
proceed from $P_1, P_2, \ldots, P_n, K(\alpha) \vdash_C F(\alpha)$ to $P_1, P_2, \ldots, P_n \vdash_C K(\alpha) \supset F(\alpha)$,
then we could proceed from $K(\alpha) \supset F(\alpha)$ to $(\alpha)\,F(\alpha)$ by rule G
(unless there were prohibitions on the use of rule G with $\alpha$, in which case
one wouldn't expect to be able to prove $P_1, P_2, \ldots, P_n \vdash_C (\alpha)\,F(\alpha)$). We
now state and prove this generalized deduction theorem. Since we expect
to follow the use of our generalized deduction theorem by a use of rule G,
we shall need to have information on the uses of rule C, and this information
is included in the theorem.

**Theorem VI.8.1. Suppose there is a demonstration of $P_1, P_2, \ldots, P_n, Q \vdash_C R$
in which there are $m$ uses of rule C. Moreover, let $F_1(y_1), F_2(y_2), \ldots, F_m(y_m)$
be the steps resulting from uses of rule C, and let $y_1, y_2, \ldots, y_m$
be the corresponding $y$'s with which rule C is used. Then there is a
demonstration of $P_1, P_2, \ldots, P_n \vdash_C Q \supset R$ in which there are $m$ uses of rule C, and
$Q \supset F_1(y_1), Q \supset F_2(y_2), \ldots, Q \supset F_m(y_m)$ are the steps resulting from these
uses of rule C, and $y_1, y_2, \ldots, y_m$ are the corresponding $y$'s.

Proof. The proof is quite like that of the original deduction theorem
(Thm.IV.6.1). If $S_1, S_2, \ldots, S_s$ are the steps of the demonstration of
$P_1, P_2, \ldots, P_n, Q \vdash_C R$, then we take $Q \supset S_1, Q \supset S_2, \ldots, Q \supset S_s$ to be
key steps in a demonstration of $P_1, P_2, \ldots, P_n \vdash_C Q \supset R$. We now show
by induction on $i$ that up to $Q \supset S_i$ one can fill in additional steps so that
the demonstration is complete up to that point, and that any uses of rule C
which have occurred up to this point are used to produce those of $Q \supset F_1(y_1)$,
$Q \supset F_2(y_2), \ldots$ which have occurred up to and including the step $Q \supset S_i$.
We take care of those cases in which $S_i$ is an axiom, or a $P$, or $Q$, or a
repetition, or the result of modus ponens exactly as in the proof of Thm.IV.6.1.
Incidentally, this takes care of the case $i = 1$, so that our induction is now
started. Also, all the cases treated so far involve no uses of rule G or rule C.

Now consider the case where $S_i$ arose from a use of rule G. Then
$S_i$ is $(x)\,S_j$ where $j < i$. Also, $x$ must not occur free in any of $P_1, P_2, \ldots,
P_n$, $Q$, $F_1(y_1), F_2(y_2), \ldots, F_\alpha(y_\alpha)$, where $F_1(y_1), F_2(y_2), \ldots, F_\alpha(y_\alpha)$ are the
results of rule C which occur previous to $S_i$ in the demonstration of
$P_1, P_2, \ldots, P_n, Q \vdash_C R$. Then in the demonstration of $P_1, P_2, \ldots, P_n \vdash_C Q \supset R$,
we have up to this point had the steps $Q \supset F_1(y_1), Q \supset F_2(y_2),
\ldots, Q \supset F_\alpha(y_\alpha)$ produced by rule C. None of these contains free
occurrences of $x$. Also $P_1, P_2, \ldots, P_n$ do not contain free occurrences of $x$. So
the restrictions for the use of rule G are satisfied. Hence we apply rule G
to $Q \supset S_j$, getting $(x).Q \supset S_j$. Now by Thm.VI.6.6, Part IX, there is a
demonstration of $\vdash (x).Q \supset S_j: \supset :Q \supset (x)\,S_j$. So we insert the steps of
this demonstration, and then the step $Q \supset (x)\,S_j$ follows by modus ponens.
As $(x)\,S_j$ is $S_i$, we now have the step $Q \supset S_i$.

Now consider the case where $S_i$ arose from a use of rule C. Then $S_i$ is
$F_\alpha(y_\alpha)$ and there is an $S_j$ with $j < i$ such that $S_j$ is $(Ex_\alpha)\,F_\alpha(x_\alpha)$. Also,
$y_\alpha$ does not occur free in any of $P_1, P_2, \ldots, P_n$, $Q$, $F_1(y_1), F_2(y_2), \ldots,
F_{\alpha-1}(y_{\alpha-1})$. So $y_\alpha$ does not occur free in any of $P_1, P_2, \ldots, P_n$, $Q \supset F_1(y_1)$,
$Q \supset F_2(y_2), \ldots, Q \supset F_{\alpha-1}(y_{\alpha-1})$, and the restrictions on the use of rule C
are satisfied. We already have $Q \supset (Ex_\alpha)\,F_\alpha(x_\alpha)$ in our demonstration. By
Thm.VI.6.8, Part II, we can fill in steps to get $Q \supset (Ey_\alpha)\,F_\alpha(y_\alpha)$. Then
by Thm.VI.6.6, Part X, we can fill in steps to get $(Ey_\alpha).Q \supset F_\alpha(y_\alpha)$. Then
by rule C, we get $Q \supset F_\alpha(y_\alpha)$. This is $Q \supset S_i$.
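
The two quantifier shifts on which the rule G and rule C cases rest can be checked in Lean 4 (core library only). This is a sketch of those two shifts only, not of the induction on demonstrations. It assumes classical logic and a nonempty universe, supplied here by a hypothetical element `d`; the theorem names are likewise hypothetical.

```lean
-- Used in the rule G case: from (x).Q ⊃ S(x) pass to Q ⊃ (x) S(x).
theorem imp_all_of_all_imp {D : Type} (Q : Prop) (S : D → Prop)
    (h : ∀ x, Q → S x) : Q → ∀ x, S x :=
  fun hQ x => h x hQ

-- Used in the rule C case: from Q ⊃ (Ey) F(y) pass to (Ey).Q ⊃ F(y).
theorem ex_imp_of_imp_ex {D : Type} (d : D) (Q : Prop) (F : D → Prop)
    (h : Q → ∃ y, F y) : ∃ y, Q → F y :=
  match Classical.em Q with
  | Or.inl hQ =>
    -- If Q holds, reuse the witness that h provides.
    (match h hQ with
     | ⟨y, hF⟩ => ⟨y, fun _ => hF⟩)
  | Or.inr hnQ =>
    -- If Q fails, any element works, since Q ⊃ F(d) holds vacuously.
    ⟨d, fun hQ => absurd hQ hnQ⟩
```

The second shift is what lets rule C be applied to a statement of the form $(Ey).Q \supset F(y)$, so that the step produced is $Q \supset F(y)$ rather than $F(y)$.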

With the aid of this theorem, we can operate with restricted quantifiers
in a manner strictly analogous to the manner with which we can operate
with unrestricted quantifiers. The only point to keep in mind is that no
restrictions can be understood in connection with free occurrences of a
letter, so that, if we are dealing with restricted quantification for the letter
$\alpha$, then we shall encounter $K(\alpha) \mathbin{\&} F(\alpha)$ or $K(\alpha) \supset F(\alpha)$ in places where we
would have only $F(\alpha)$ if we were using unrestricted quantification.

The use of restricted quantification casts some light on the question of
unorthodox converses (see the end of Sec. 4 of Chapter II). It is natural
to take $(x).Q \supset P$ as the converse of $(x).P \supset Q$. If we follow this same rule
with restricted quantification, we get $(\alpha).G(\alpha) \supset F(\alpha)$ as the converse of
$(\alpha).F(\alpha) \supset G(\alpha)$. That is, $(x):K(x). \supset .G(x) \supset F(x)$ is the converse of
$(x):K(x). \supset .F(x) \supset G(x)$. If we leave off the initial $(x)$, as is customary
in everyday mathematics, we get $K(x). \supset .G(x) \supset F(x)$ as the converse of
$K(x). \supset .F(x) \supset G(x)$. From this, it is but a step to considering
$P \supset (R \supset Q)$ to be the converse of $P \supset (Q \supset R)$ in cases where $P$ is noticeably
simpler than $Q$ or $R$.

Let us see how much simpler the definition of continuity becomes if we
use restricted quantification. We shall agree for the moment that $\delta$ and $\varepsilon$
denote positive real numbers. Then we can write the definition of "$f$ is
continuous at the point $c$" as

\[ (\varepsilon)(E\delta)(x):|x - c| < \delta. \supset .|f(x) - f(c)| < \varepsilon. \]

A similar simplification will ensue whenever we are dealing with
quantities from an assigned range of values.
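
To make the abbreviation concrete, here is a sketch in Lean 4 (core library only) that defines the two restricted quantifiers as ordinary definitions and checks that the short form of continuity unfolds to the long one. Everything named here is an assumption made for illustration: `RAll` and `REx` stand for $(\alpha)$ and $(E\alpha)$, `R` is an abstract number type, `pos` says that a number is positive, `lt` stands for $<$, and `dist a b` stands for $|a - b|$.

```lean
-- (α) F(α) and (Eα) F(α) for a quantity α restricted by K.
def RAll {D : Type} (K F : D → Prop) : Prop := ∀ x, K x → F x
def REx {D : Type} (K F : D → Prop) : Prop := ∃ x, K x ∧ F x

-- "f is continuous at c", written with ε and δ restricted to positive numbers.
def ContAt {R : Type} (pos : R → Prop) (lt : R → R → Prop) (dist : R → R → R)
    (f : R → R) (c : R) : Prop :=
  RAll pos fun ε => REx pos fun δ =>
    ∀ x, lt (dist x c) δ → lt (dist (f x) (f c)) ε

-- Unfolding the restricted quantifiers gives the usual longer statement.
theorem contAt_unfold {R : Type} (pos : R → Prop) (lt : R → R → Prop)
    (dist : R → R → R) (f : R → R) (c : R) :
    ContAt pos lt dist f c ↔
      ∀ ε, pos ε → ∃ δ, pos δ ∧
        ∀ x, lt (dist x c) δ → lt (dist (f x) (f c)) ε :=
  Iff.rfl
```

The proof is `Iff.rfl` because the two sides agree by definition, which is the formal counterpart of saying that the restricted form merely "signifies" the unrestricted one.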

EXERCISES 

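VI.8.1. If $\alpha$ and $\beta$ denote quantities subject to the restriction $K(\alpha)$ and
$\gamma$ denotes a quantity subject to the restriction $L(\gamma)$, prove:

(a) $\vdash (\alpha)\,P \supset (E\alpha)\,P$.

(b) $\vdash (E\alpha)\,F(\alpha). \equiv .(E\alpha,\beta).F(\alpha) \vee F(\beta)$.

(c) $\vdash (E\alpha)\,P.(\alpha)\,Q. \supset .(E\alpha).PQ$.

(d) $\vdash (\alpha,\gamma).P \supset Q: \supset :(\alpha,\gamma)\,P. \supset .(\alpha,\gamma)\,Q$.

(e) $\vdash (\alpha)(E\gamma).P \supset Q: \supset :(E\alpha)(\gamma)\,P. \supset .(E\alpha,\gamma)\,Q$.

(f) $\vdash (\alpha).P \vee Q: \supset :(\alpha)\,P. \vee .(E\alpha)\,Q$.

(g) $\vdash (\alpha)\,P. \vee .(E\alpha)\,Q: \supset :(E\alpha).P \vee Q$.

(h) $\vdash (E\alpha)\,P. \supset .(\alpha)\,Q: \supset :(\alpha).P \supset Q$.

(i) $\vdash (E\alpha)\,P. \supset .(E\alpha)\,Q: \supset :(E\alpha).P \supset Q$.

(j) $\vdash (\alpha)\,P. \supset .(\alpha)\,Q: \supset :(E\alpha).P \supset Q$.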

VI.8.2.  On  November  22,  1948,  in  Pop's  place,  three  consecutive  song 
titles  in  the  juke  box  were  "Everybody  loves  somebody,"  "I  never  loved 
anyone,"  and  "Somebody  loves  me."  In  order  to  avoid  complications  due 
to  tenses,  let  us  rewrite  the  second  as  "I  do  not  love  anyone."  Let  us  refer 
to them, in the order given, as $A$, $B$, and $C$. Using "I" to stand for "I" or
"me" according to grammatical position, and $L(x,y)$ to stand for "$x$ loves
$y$," translate each of the above song titles, using $x$, $y$, and $z$ as variables
with their ranges restricted to people. State and prove all provable
implications of the form $\vdash W \supset V$, where each of $W$ and $V$ is one of $A$, ${\sim}A$, $B$,
${\sim}B$, $C$, or ${\sim}C$.

VI.8.3. Using restricted quantification translate "For every positive
$\varepsilon$ there is a positive integer $m$ such that for every greater positive integer $n$,
$|a_m - a_n| < \varepsilon$" into symbolic logic; also write the dual of this statement.

VI.8.4. Prove that
$\vdash (\alpha_1, \alpha_2, \ldots, \alpha_n).P \supset Q: \supset :(\alpha_1, \alpha_2, \ldots, \alpha_n)\,P. \supset .(\alpha_1, \alpha_2, \ldots, \alpha_n)\,Q$,
regardless of whether any of the $\alpha$'s are restricted or
whether the restricted $\alpha$'s have the same restriction or not.

VI.8.5. Let $P$ have no free occurrences of any variable. Let $Q$ be got
from $P$ by replacing all quantifiers of $P$ by restricted quantifiers, all having
identical restrictions. Show that, if $P$ can be deduced from Axiom schemes
1 to 6 by modus ponens, then so can $Q$. (Hint. Let $S_1, S_2, \ldots, S_s$ be a
demonstration of $\vdash P$, using only Axiom schemes 1 to 6. Let $\Sigma_i$ be related to
$S_i$ as $Q$ is related to $P$. Let $x_1, \ldots, x_n$ be all variables which occur free in
any of $S_1, \ldots, S_s$. Using Ex.VI.8.4 at the points where modus ponens was
used with the $S$'s, prove $\vdash (x_1, \ldots, x_n)\,\Sigma_i$ for $1 \le i \le s$, where in
$(x_1, \ldots, x_n)\,\Sigma_i$ all quantifiers have the same restrictions as in $Q$. Then get $\vdash Q$ from
$\vdash (x_1, \ldots, x_n)\,Q$ by the generalized form of Thm.VI.6.6, Part I.)

VI.8.6.  Indicate  that  the  result  of  the  preceding  exercise  would  fail  to 
hold  generally  if  different  quantifiers  in  Q  are  allowed  to  have  different 
restrictions,  and  state  where  the  proof  in  the  preceding  exercise  would 
break down. (Hint. Take $P$ to be $(x)\,F(x). \equiv .(y)\,F(y)$.)

9.  Applications  to  Everyday  Mathematics.  At  the  end  of  Chapter  II, 
we  listed  a  number  of  rules  of  everyday  logic  which  could  be  derived  by 
truth  tables.  We  are  now  in  a  position  to  extend  the  list  of  rules  of  everyday 
logic considerably by giving rules derived from results of the present
chapter. Perhaps most important are the intuitive equivalents of rule G and
rule C. However, we need not discuss them further here, since we have
already discussed them at length in Secs. 4 and 7 of the present chapter.
The duality theorem is very useful, and indeed special cases of it are in
common use in analysis. By means of the duality theorem, we can write
variants of the principle "If ${\sim}Q \supset {\sim}P$, then $P \supset Q$," to wit:

If $Q^* \supset {\sim}P$, then $P \supset Q$.
If ${\sim}Q \supset P^*$, then $P \supset Q$.
If $Q^* \supset P^*$, then $P \supset Q$.

Here $P^*$ and $Q^*$ denote the duals of $P$ and $Q$. Similarly (as noted earlier)
we get variants of the principle of reductio ad absurdum, to wit:

Assume $P^*$ and deduce $Q{\sim}Q$.
Assume ${\sim}P$ and deduce $QQ^*$.
Assume $P^*$ and deduce $QQ^*$.

One can write similar variants of various other of the principles by
replacing ${\sim}P$ by $P^*$, ${\sim}Q$ by $Q^*$, etc. We mention one particularly useful one,
namely, the proof of $P \equiv Q$ by proving each of $P \supset Q$ and $P^* \supset Q^*$.

By the deduction theorem, we get $\vdash P \supset Q$ whenever we can show $P \vdash Q$.
This is very useful when taken in conjunction with various powerful ways
of inferring $P \vdash Q$, notably Thm.VI.7.2. The reader is reminded that he can
also infer $\vdash P \supset Q$ from any of ${\sim}Q \vdash {\sim}P$, $Q^* \vdash {\sim}P$, ${\sim}Q \vdash P^*$, or $Q^* \vdash P^*$.

In this connection, there is an important consideration having to do with
proofs by reductio ad absurdum. To prove $\vdash P$, it suffices to prove
$\vdash {\sim}P \supset Q{\sim}Q$, and so it suffices to prove ${\sim}P \vdash Q{\sim}Q$, for which it suffices
to prove ${\sim}P \vdash_C Q{\sim}Q$ if there are no free $y_1, y_2, \ldots, y_m$ in $Q$ (where
$y_1, y_2, \ldots, y_m$ are the $y$'s used with rule C in the proof of ${\sim}P \vdash_C Q{\sim}Q$).
The important point to note is that, in the special case under consideration,
it is permissible that some or all of $y_1, y_2, \ldots, y_m$ occur free in $Q$. For from
${\sim}P \vdash_C Q{\sim}Q$, we get ${\sim}P \vdash_C R{\sim}R$ without additional uses of rule G or rule
C by Thm.VI.6.1, Part LXV. This can be done regardless of what we take
$R$ to be, and so we can choose an $R$ with no free occurrences of $y_1, y_2, \ldots, y_m$.
For instance, we could take $R$ to be $(y_1, y_2, \ldots, y_m)\,Q$. So to infer $\vdash P$, it
suffices to prove ${\sim}P \vdash_C Q{\sim}Q$ quite regardless of what variables occur free
in $Q$.
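
The two propositional facts used in this paragraph can be stated in Lean 4 (core library only). This is a sketch of the propositional core alone and does not model the bookkeeping for rule C; the theorem names are hypothetical.

```lean
-- Reductio ad absurdum: if ∼P leads to a contradiction Q ∼Q, then P.
theorem reductio {P Q : Prop} (h : ¬P → Q ∧ ¬Q) : P :=
  Classical.byContradiction (fun hnP => (h hnP).right (h hnP).left)

-- From one contradiction Q ∼Q we may pass to any other, R ∼R.
theorem swap_contradiction {Q R : Prop} (h : Q ∧ ¬Q) : R ∧ ¬R :=
  absurd h.left h.right
```

The second theorem is what allows $Q$ to be exchanged for a statement $R$ in which none of the $y$'s occurs free.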

The duality theorem and its corollary give a powerful means of proving
equivalences, which are then useful either in their own right or in
connection with the substitution theorem. One can often simplify the proof of a
theorem by replacing some portion of it by an equivalent portion, which is
permitted by the substitution theorem.

To prove statements of the form $(x)\,P$, we have Thm.VI.4.1 and
Thm.VI.4.2. More generally, we can apply rule G and then use Thm.VI.7.1
or VI.7.2. An alternative, and not uncommon, procedure is to use reductio
ad absurdum, of which the most common variation is to start with the dual
of $(x)\,P$, namely, $(Ex)\,P^*$, use rule C, and then proceed to a contradiction.

With  the  axioms  presently  at  our  disposal,  we  cannot  prove  (Ex)  P  for 
any  P's  of  real  interest.  The  only  procedure  we  have  as  yet  for  proving 
statements  of  the  form  (Ex)  P,  except  for  general  methods  such  as  reductio 
ad  absurdum,  is  that  furnished  by  Thm.VI.7.3.  Among  the  axioms  which 
we  shall  later  add  will  be  some  whose  primary  purpose  is  to  prove  additional 
statements  of  the  form  (Ex)  P. 

We  now  give  a  number  of  illustrations  from  everyday  mathematics. 

Theorem.     The  derived  set  of  a  point  set  is  closed. 

We use restricted quantification, letting $\delta$ and $\varepsilon$ denote positive real
numbers, and $x$, $y$, and $z$ denote points. We also use the notation $x \in \alpha$ to
denote that $x$ is a member of $\alpha$. We recall some definitions.

"$x$ is a limit point of $\alpha$ if for every positive $\varepsilon$ there is a point of $\alpha$ different
from $x$ whose distance from $x$ is less than $\varepsilon$."
In symbols:

\[ (\varepsilon)(Ey).y \neq x.y \in \alpha.|x - y| < \varepsilon. \]

"$\alpha$ is a closed set if all limit points of $\alpha$ are in $\alpha$."
In symbols:

\[ (x):.(\varepsilon)(Ey).y \neq x.y \in \alpha.|x - y| < \varepsilon: \supset :x \in \alpha. \]

"$\beta$ is the derived set of $\alpha$ if $\beta$ consists of all limit points of $\alpha$."
In symbols:

(1) $(x):.(\varepsilon)(Ey).y \neq x.y \in \alpha.|x - y| < \varepsilon: \equiv :x \in \beta$.

Proof. To prove our theorem, we must show that from (1) we can infer

\[ (x):.(\varepsilon)(Ey).y \neq x.y \in \beta.|x - y| < \varepsilon: \supset :x \in \beta. \]

Let us first give the proof in words, as it might appear in a text on
analysis.

Let $x$ be a limit point of $\beta$. Let $\varepsilon > 0$. Then $\varepsilon/2 > 0$, and we can choose
a point $y$ in $\beta$ different from $x$ whose distance from $x$ is less than $\varepsilon/2$. Since
$y \neq x$, $|x - y| > 0$, and so since $y$ is in $\beta$, and hence is a limit point of $\alpha$,
we can choose a point $z$ in $\alpha$ different from $y$ whose distance from $y$ is less
than $|x - y|$. Since $|y - z| < |x - y|$, we have $z \neq x$ and $|y - z| < \varepsilon/2$.
So $|x - z| = |(x - y) + (y - z)| \le |x - y| + |y - z| < \varepsilon/2 + \varepsilon/2 = \varepsilon$.
So we have found a $z$ in $\alpha$ different from $x$ with $|x - z| < \varepsilon$.
Hence $x$ is a limit point of $\alpha$, and so $x$ is in $\beta$.

We now give a formal proof, filling in the full logical details. Insertion
of the full details lengthens the proof considerably.

Let us assume that $\beta$ is the derived set of $\alpha$,

(1) $(x):.(\varepsilon)(Ey).y \neq x.y \in \alpha.|x - y| < \varepsilon: \equiv :x \in \beta$,

that $x$ is a limit point of $\beta$,

(2) $(\varepsilon)(Ey).y \neq x.y \in \beta.|x - y| < \varepsilon$,

and that $\varepsilon$ is positive,

(3) $\varepsilon > 0$.

By (3), we get

(4) $\varepsilon/2 > 0$.

By (2), Axiom scheme 6, and (4), we get

(5) $(Ey).y \neq x.y \in \beta.|x - y| < \varepsilon/2$.

The procedure which we carried out to get step (5) is standard, but as this
is our first use of it we shall go through it again in slow motion.
Since we are using restricted quantification, (2) signifies

\[ (w):w > 0. \supset .(Ey).y \neq x.y \in \beta.|x - y| < w. \]

Now, by Axiom scheme 6,

\[ \vdash (w)\,F(w). \supset .F(\varepsilon/2), \]

and so we get

\[ \varepsilon/2 > 0. \supset .(Ey).y \neq x.y \in \beta.|x - y| < \varepsilon/2. \]

Now by (4) and modus ponens we get (5).

Applying rule C to (5), and recalling that we are dealing with restricted
quantification, we get

(6) $y$ is a point,

(7) $y \neq x$,

(8) $y \in \beta$,

(9) $|x - y| < \varepsilon/2$.

From (1), we get

\[ (x):.(\varepsilon)(Ez).z \neq x.z \in \alpha.|x - z| < \varepsilon: \equiv :x \in \beta. \]

From this by Axiom scheme 6 and (6) we get

\[ (\varepsilon)(Ez).z \neq y.z \in \alpha.|y - z| < \varepsilon: \equiv :y \in \beta. \]

From this and (8), we get

(10) $(\varepsilon)(Ez).z \neq y.z \in \alpha.|y - z| < \varepsilon$.

From (7), we get

(11) $|x - y| > 0$.

From (10) by Axiom scheme 6 and (11), we get

(12) $(Ez).z \neq y.z \in \alpha.|y - z| < |x - y|$.

By rule C,

(13) $z$ is a point,

(14) $z \neq y$,

(15) $z \in \alpha$,

(16) $|y - z| < |x - y|$.

By (16),

(17) $z \neq x$.

By (16) and (9),

(18) $|y - z| < \varepsilon/2$.

By (18) and (9),

(19) $|x - z| < \varepsilon$.

(Here we have skipped a couple of purely mathematical steps. See our
verbal proof.) Collecting (13), (17), (15), and (19), we have

(20) $z$ is a point $.z \neq x.z \in \alpha.|x - z| < \varepsilon$.

So by Thm.VI.7.3,

(21) $(Ez).z \neq x.z \in \alpha.|x - z| < \varepsilon$.

So by Thm.VI.6.8, Part II,

(22) $(Ey).y \neq x.y \in \alpha.|x - y| < \varepsilon$.

We have shown

(1), (2), (3) $\vdash_C$ (22).

As neither $y$ nor $z$ occurs free in (22), we get

(1), (2), (3) $\vdash$ (22).

That is,

(1), (2), $\varepsilon > 0 \vdash (Ey).y \neq x.y \in \alpha.|x - y| < \varepsilon$.

So by the deduction theorem and the generalization theorem

(1), (2) $\vdash (\varepsilon)(Ey).y \neq x.y \in \alpha.|x - y| < \varepsilon$.

However, by Axiom scheme 6,

(1), $x$ is a point $\vdash (\varepsilon)(Ey).y \neq x.y \in \alpha.|x - y| < \varepsilon: \equiv :x \in \beta$.

So

(1), $x$ is a point, (2) $\vdash x \in \beta$.

Then by the deduction theorem,

(1), $x$ is a point $\vdash$ (2) $\supset x \in \beta$.

That is,

(1), $x$ is a point $\vdash (\varepsilon)(Ey).y \neq x.y \in \beta.|x - y| < \varepsilon: \supset :x \in \beta$.

Then by the deduction theorem and the generalization theorem,

(1) $\vdash (x):.(\varepsilon)(Ey).y \neq x.y \in \beta.|x - y| < \varepsilon: \supset :x \in \beta$.

Then our theorem follows by the deduction theorem.

We  now  give  a  more  complicated  instance. 

Theorem. If in a region $R$ each $f_n$ is continuous, and $f_n$ converges
uniformly to $f$ in $R$, then $f$ is continuous in $R$.

We use restricted quantification, letting $\delta$ and $\varepsilon$ denote positive real
numbers, $x$, $y$, and $z$ denote points of $R$, and $m$ and $n$ denote positive
integers. Then

\[ (x,\varepsilon)(E\delta)(y):|x - y| < \delta. \supset .|f(x) - f(y)| < \varepsilon \]

is the statement that $f$ is continuous in $R$, and

\[ (\varepsilon)(En)(m,x):m > n. \supset .|f(x) - f_m(x)| < \varepsilon \]

is the statement that $f_n$ converges uniformly to $f$ in $R$.

Proof. Let us first give the proof in words, as it might appear in a text
on analysis or advanced calculus.

Given $\varepsilon > 0$, choose $n$ so that, for $m > n$, $|f(x) - f_m(x)| < \varepsilon/3$. Then
choose $\delta$ so that, for $|x - y| < \delta$, $|f_{n+1}(x) - f_{n+1}(y)| < \varepsilon/3$.

Then

\[ |f(x) - f(y)| \]

\[ = |(f(x) - f_{n+1}(x)) + (f_{n+1}(x) - f_{n+1}(y)) + (f_{n+1}(y) - f(y))| \]

\[ \le |f(x) - f_{n+1}(x)| + |f_{n+1}(x) - f_{n+1}(y)| + |f(y) - f_{n+1}(y)| \]

\[ < \varepsilon/3 + \varepsilon/3 + \varepsilon/3 = \varepsilon. \]

So,  whenever  \x  —  y\  <  8,\  f{x)  —  f{y)\  <  e,  and  so  /  is  continuous  in  R. 
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Let  us  now  give  a  formal  proof,  filling  in  the  full  logical  details,  which 
considerably  lengthens  the  proof.  We  let  n  e  PI,  x  e  R,  and  e  >  0  denote 
"n  is  a  positive  integer,"  "x  is  a  point  of  R,"  and  "e  is  a  positive  real  num- 
ber."   We  first  prove: 

Lemma,  x  e  R,  {x).\  j{x)  -  /„+i(a;)  1  <  e/3,  {y):\  x  -  y  |  <  5.  D  . 
I  f...{x)  -  /„.,(?/)  I  <  e/3  h  (2/):|  x-y\  <b.D  .\  j{x)  -  J{y)  \  <  s. 

Proof. Assume

(i) x ∈ R,

(ii) (x).|f(x) − fₙ₊₁(x)| < ε/3,

(iii) (y):|x − y| < δ. ⊃ .|fₙ₊₁(x) − fₙ₊₁(y)| < ε/3,

(iv) y ∈ R,

(v) |x − y| < δ.

By (ii), Axiom scheme 6, and (i), we get

(vi) |f(x) − fₙ₊₁(x)| < ε/3.

By (ii), Axiom scheme 6, and (iv), we get

(vii) |f(y) − fₙ₊₁(y)| < ε/3.

By (iii), Axiom scheme 6, and (iv), we get

|x − y| < δ. ⊃ .|fₙ₊₁(x) − fₙ₊₁(y)| < ε/3,

so that by (v), we get

(viii) |fₙ₊₁(x) − fₙ₊₁(y)| < ε/3.

By (vi), (vii), and (viii), together with certain mathematical theorems,
we get

(ix) |f(x) − f(y)| < ε.

So we have shown

(i), (ii), (iii), y ∈ R, |x − y| < δ ⊢ |f(x) − f(y)| < ε.

So two uses of the deduction theorem give

(i), (ii), (iii) ⊢ y ∈ R. ⊃ :|x − y| < δ. ⊃ .|f(x) − f(y)| < ε.

Then we can use the generalization theorem to infer

(i), (ii), (iii) ⊢ (y):|x − y| < δ. ⊃ .|f(x) − f(y)| < ε.

This is our lemma.

Now assume that each fₙ is continuous in R,

(1) (n,x,ε)(Eδ)(y):|x − y| < δ. ⊃ .|fₙ(x) − fₙ(y)| < ε,

that fₙ converges uniformly to f in R,

(2) (ε)(En)(m,x):m > n. ⊃ .|f(x) − fₘ(x)| < ε,

that x is a point of R,

(3) x ∈ R,

and that ε is positive,

(4) ε > 0.

Then

(5) ε/3 > 0,

and so by (2),

(6) (En)(m,x):m > n. ⊃ .|f(x) − fₘ(x)| < ε/3.

Then by rule C,

(7) n ∈ PI,

(8) (m,x):m > n. ⊃ .|f(x) − fₘ(x)| < ε/3.

From (8), by Thm.VI.6.6, Part IX,

(9) (m):m > n. ⊃ .(x).|f(x) − fₘ(x)| < ε/3.

Now, by (7), we get n + 1 ∈ PI, so that by (9) we have

(10) n + 1 > n. ⊃ .(x).|f(x) − fₙ₊₁(x)| < ε/3.

Also by (7), n + 1 > n, so that

(11) (x).|f(x) − fₙ₊₁(x)| < ε/3.

Since n + 1 ∈ PI by (7), we get by (1),

(x,ε)(Eδ)(y):|x − y| < δ. ⊃ .|fₙ₊₁(x) − fₙ₊₁(y)| < ε.

Then by (3),

(ε)(Eδ)(y):|x − y| < δ. ⊃ .|fₙ₊₁(x) − fₙ₊₁(y)| < ε.

Then by (5),

(Eδ)(y):|x − y| < δ. ⊃ .|fₙ₊₁(x) − fₙ₊₁(y)| < ε/3.

So by rule C

(12) δ > 0,

(13) (y):|x − y| < δ. ⊃ .|fₙ₊₁(x) − fₙ₊₁(y)| < ε/3.

Now by our lemma, (3), (11), and (13), we get (with no additional uses
of rule G or rule C),

(14) (y):|x − y| < δ. ⊃ .|f(x) − f(y)| < ε.

So by Thm.VI.7.3,

(15) (Eδ)(y):|x − y| < δ. ⊃ .|f(x) − f(y)| < ε.

We have now shown

(1), (2), x ∈ R, ε > 0 ⊢_C (Eδ)(y):|x − y| < δ. ⊃ .|f(x) − f(y)| < ε.

As neither n nor δ occurs free in the conclusion of this, we can replace
⊢_C by ⊢. Then by the deduction theorem and the generalization theorem,
we infer

(1), (2), x ∈ R ⊢ (ε)(Eδ)(y):|x − y| < δ. ⊃ .|f(x) − f(y)| < ε.

Using the deduction theorem and generalization theorem again gives

(1), (2) ⊢ (x,ε)(Eδ)(y):|x − y| < δ. ⊃ .|f(x) − f(y)| < ε.

Our theorem now follows by two more uses of the deduction theorem.

By using Thm.VI.8.1, we could have avoided having to prove a preliminary
lemma in the previous proof. That is, by using Thm.VI.8.1, we could
have  kept  the  formal  proof  closer  to  the  intuitive  proof. 

We  have  filled  in  full  logical  details  in  the  above  proofs  to  illustrate  the 
logical principles involved. The reader should not suppose that it is necessary
to fill in the full logical details in order to have a correct formal proof.
All  that  is  necessary  is  to  furnish  enough  details  so  that  the  reader  can 
reconstruct  a  proof  with  full  details  if  called  upon  to  do  so.  How  many 
details  are  required  for  this  purpose  depends  on  the  reader's  experience. 
As  the  reader  becomes  more  experienced,  fewer  details  need  to  be  supplied. 
It  will  be  our  policy  to  give  fewer  and  fewer  details  as  we  proceed  and  the 
reader  acquires  experience.  Thus,  at  first  we  wrote  out  demonstrations  in 
full,  then  we  quickly  changed  to  merely  indicating  key  steps,  and  we  now 
give  fewer  and  fewer  of  these.  Even  in  the  last  two  proofs,  where  we  were 
supposedly  giving  full  details,  we  have  progressively  diminished  the  amount 
of  detail  given.  For  example,  in  the  first  proof,  we  gave  the  full  details  of 
how  to  get  from  steps  (2)  and  (4)  to  (5).  In  the  second  proof  we  make  the 
analogous  step  from  (2)  and  (5)  to  (6)  without  comment. 

We now present one more illustration which is of particular logical
interest. In his doctor's thesis, M. D. Donsker needed an upper bound for the
Lebesgue measure of the set of all real numbers t such that

(Ex):.(m,n):1 ≤ n ≤ N.[M(n − 1)/N] < m ≤ [Mn/N]. ⊃ .φ(x,m,n):.
(Em,n):1 ≤ m ≤ M.1 ≤ n ≤ N.(m − 1)/M < t ≤ m/M.(n − 1)/N < t ≤ n/N.∼φ(x,m,n),

where we are using restricted quantification with x representing a point in
many-dimensional space and m and n representing positive integers. M
and N are fixed positive integers, [w] denotes the greatest integer less than
or equal to w, and φ(x,m,n) is a very complicated statement involving
x, m, and n. The details of φ(x,m,n) need not concern us. Let us denote
the above condition on t by F(t), and denote the set of all t's such that F(t)
by T. Donsker got an upper bound on the measure of T by showing that
T is a subset of a set T′ of all real numbers t such that

(En).1 ≤ n ≤ N.(n − 1)/N < t ≤ n/N.[Mn/N] < Mt.

We denote this condition by G(t). It is not particularly difficult to get an
upper bound for the measure of T′. So the only difficulty is to show that
T is indeed a subset of T′. For this it is sufficient (and necessary) to show
(t).F(t) ⊃ G(t). This is done by showing (t).G*(t) ⊃ F*(t), where G*(t) and
F*(t) are the duals of G(t) and F(t). Since t is a real number we can write
G(t) in the equivalent form

(En).1 ≤ n ≤ N.(n − 1)/N < t ≤ n/N.∼(Mt ≤ [Mn/N]).

Then G*(t) has the form

(n):1 ≤ n ≤ N.(n − 1)/N < t ≤ n/N. ⊃ .Mt ≤ [Mn/N].

Then (t).G*(t) ⊃ F*(t) has the form

(t)::(n):1 ≤ n ≤ N.(n − 1)/N < t ≤ n/N. ⊃ .Mt ≤ [Mn/N]:: ⊃ ::
(x):.(m,n):1 ≤ n ≤ N.[M(n − 1)/N] < m ≤ [Mn/N]. ⊃ .φ(x,m,n): ⊃ :
(m,n):1 ≤ m ≤ M.1 ≤ n ≤ N.(m − 1)/M < t ≤ m/M.(n − 1)/N < t ≤ n/N. ⊃ .φ(x,m,n).




In spite of its apparent complication, this is very easy to prove. We
assume

(1) (n):1 ≤ n ≤ N.(n − 1)/N < t ≤ n/N. ⊃ .Mt ≤ [Mn/N],

(2) x is a point,

(3) (m,n):1 ≤ n ≤ N.[M(n − 1)/N] < m ≤ [Mn/N]. ⊃ .φ(x,m,n),

(4) m ∈ PI,

(5) n ∈ PI,

(6) 1 ≤ m ≤ M,

(7) 1 ≤ n ≤ N,

(8) (m − 1)/M < t ≤ m/M,

(9) (n − 1)/N < t ≤ n/N,

and undertake to prove φ(x,m,n). By (8) and (9),

M(n − 1)/N < Mt ≤ m,

so that

(10) [M(n − 1)/N] < m.

Also, by (1), (5), (7), and (9),

Mt ≤ [Mn/N],

and so by (8),

m − 1 < [Mn/N].

Since m and [Mn/N] are each integers, we get

(11) m ≤ [Mn/N].

Now by (3), (4), (5), (7), (10), and (11), we get

(12) φ(x,m,n).

Now (t).G*(t) ⊃ F*(t) follows by repeated applications of the deduction
theorem and the generalization theorem.

Actually, Donsker used the procedure above to discover what formula
he should use for G(t). In the proof outlined above, (1) is G*(t). To
discover G(t), Donsker merely chose a G*(t) which if combined with (2) to (9),
inclusive, would enable him to prove φ(x,m,n). Then G(t) resulted from
dualizing.
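
The two integer steps of the argument, (10) and (11), can be spot-checked numerically. The following is only a sketch under stated assumptions: the intervals are taken to be half-open, (m − 1)/M < t ≤ m/M and (n − 1)/N < t ≤ n/N, so that each t in (0, 1] determines m and n uniquely; the function name `check` and the sampling grid are illustrative and not part of the text.

```python
from fractions import Fraction
from math import ceil, floor

def check(M, N, steps):
    """Spot-check steps (10) and (11) for every t = k/steps in (0, 1]."""
    for k in range(1, steps + 1):
        t = Fraction(k, steps)
        m = ceil(M * t)  # the unique m with (m-1)/M < t <= m/M, as in (8)
        n = ceil(N * t)  # the unique n with (n-1)/N < t <= n/N, as in (9)
        upper = floor(Fraction(M * n, N))        # [Mn/N]
        lower = floor(Fraction(M * (n - 1), N))  # [M(n-1)/N]
        if M * t <= upper:                       # hypothesis (1) for this n
            if not (lower < m <= upper):         # conclusions (10) and (11)
                return False
    return True
```

Exact rational arithmetic is used so that the greatest-integer brackets are not disturbed by rounding. A passing check on a grid is of course no substitute for the proof; it only guards against a misplaced strict or non-strict inequality.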

EXERCISES 

VI.9.1. Write out a formal version of a proof of the result that, if
lim aₙ = a and lim bₙ = b, then lim (aₙ + bₙ) = a + b. (For instance, the
proof given in Hardy, 1947, pages 129 to 130.)

VI.9.2.  In  a  certain  paper,  the  author  was  discussing  a  certain  function 
S{F)  whose  value  was  always  a  positive  integer.  This  author  devoted 
several  pages  to  an  intricate  proof  that  "S(F)  is  odd  is  a  sufficient  condition 
that  (x)  F(x)."  He  then  devoted  several  more  pages  to  an  intricate  proof 
that "S(F) is even is a necessary condition that (Ex) ∼F(x)." What
should  one  say  to  this  author? 

10.  Church's  Theorem.  There  is  a  point  in  connection  with  the  axioms 
given  so  far  which  is  not  relevant  to  the  main  purpose  of  this  text,  but  which 
is  of  remarkable  interest.  It  concerns  the  problem  of  deciding  whether  or 
not  a  given  statement  can  be  derived  from  the  six  axiom  schemes  by  use  of 
modus  ponens. 

When  we  had  only  the  first  three  axiom  schemes,  the  corresponding 
problem  was  readily  solvable.  We  recall  that  the  method  of  truth-value 
tables  gave  a  systematic  solution,  namely,  that  if  a  statement  P  always 
takes  the  value  T,  then  it  can  be  derived  from  the  first  three  axiom  schemes 
by  modus  ponens,  but  if  P  ever  takes  the  value  F,  then  it  cannot  be  so 
derived. 
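
The truth-value table method just recalled is mechanical enough to sketch in a few lines. The encoding below is an assumption made for illustration only: a statement is either a variable name or a tuple headed by "not", "and", or "implies", and the function names are hypothetical.

```python
from itertools import product

def evaluate(formula, assignment):
    """Truth value of a formula under an assignment of T/F to its variables."""
    if isinstance(formula, str):
        return assignment[formula]
    op = formula[0]
    if op == "not":
        return not evaluate(formula[1], assignment)
    if op == "and":
        return evaluate(formula[1], assignment) and evaluate(formula[2], assignment)
    if op == "implies":
        return (not evaluate(formula[1], assignment)) or evaluate(formula[2], assignment)
    raise ValueError(f"unknown connective: {op!r}")

def variables(formula):
    """Set of variable names occurring in a formula."""
    if isinstance(formula, str):
        return {formula}
    return set().union(*(variables(part) for part in formula[1:]))

def always_true(formula):
    """Decide whether the formula takes the value T in every row of its table."""
    names = sorted(variables(formula))
    return all(
        evaluate(formula, dict(zip(names, row)))
        for row in product([True, False], repeat=len(names))
    )
```

For instance, `always_true(("implies", "P", ("implies", "Q", "P")))` inspects four rows and answers yes, while `always_true(("implies", "P", "Q"))` answers no. The procedure always terminates, because a statement with k variables has only 2^k rows.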

The  problem  of  finding  a  systematic  procedure  which,  for  a  given  set  of 
axioms  and  rules,  will  tell  whether  or  not  any  particular  arbitrarily  chosen 
statement  can  be  derived  from  the  given  set  of  axioms  and  rules  is  called 
the  decision  problem  for  the  given  set  of  axioms  and  rules.  The  method  of 
truth-value  tables  gives  a  solution  to  the  decision  problem  for  the  statement 
calculus  (the  first  three  axiom  schemes  with  modus  ponens).  What  about 
a  solution  to  the  decision  problem  for  the  predicate  calculus  (the  first  six 
axiom  schemes  with  modus  ponens)  ?    So  far  none  has  been  discovered. 

Of  course,  given  a  statement  P,  one  can  search  for  a  demonstration  of  it. 
If  such  a  demonstration  is  found,  then  certainly  P  can  be  derived  from  the 
axioms  by  modus  ponens.    But  suppose  no  demonstration  is  found?    Then 
it might be the case that no demonstration exists, but it could equally well
be the case that one had merely given up looking too soon. Thus, one may
happen to settle the question of derivability for a particular statement P
by finding a demonstration of it, but if one fails to find a demonstration one
can conclude nothing about the derivability of P. As a solution of the
decision problem must be a method which will infallibly decide the
derivability of any arbitrarily chosen statement P, the method of looking for a
demonstration does not constitute a solution of the decision problem,
however successful one may be in finding demonstrations of particular
statements.
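
The asymmetry described here can be pictured with a toy search routine. This is a hypothetical sketch, not a proof search for the predicate calculus: `candidates` stands for any enumeration of candidate demonstrations, `is_demonstration` for a checker, and `budget` for the point at which one gives up.

```python
from itertools import count

def search_for_demonstration(candidates, is_demonstration, budget):
    """Return a demonstration found among the first `budget` candidates.

    Returns None when none is found, which means "unknown", not "underivable".
    """
    for examined, candidate in enumerate(candidates):
        if examined >= budget:
            return None
        if is_demonstration(candidate):
            return candidate
    return None

# Toy stand-in: pretend the only "demonstration" is candidate number 1000.
found_late = search_for_demonstration(count(), lambda k: k == 1000, budget=2000)
gave_up = search_for_demonstration(count(), lambda k: k == 1000, budget=100)
```

With the larger budget the search succeeds; with the smaller one it returns None although a demonstration exists, which is exactly why a failed search settles nothing.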

As  we  said,  no  solution  of  the  decision  problem  for  the  predicate  calculus 
has  yet  been  discovered. 

We  can  say  much  more.  There  is  no  solution  to  be  discovered!  This 
result  was  proved  by  Alonzo  Church  (see  Church,  1936). 

Let  there  be  no  misunderstanding  here.  Church's  theorem  does  not  say 
merely  that  no  solution  has  been  found  or  that  a  solution  is  hard  to  find. 
It  says  that  there  simply  is  no  solution  at  all.  What  is  more,  the  theorem 
continues  to  hold  as  we  add  further  axioms  (unless  by  mischance  we  add 
axioms  which  lead  to  a  contradiction,  so  that  every  statement  becomes 
derivable).  Applying  Church's  theorem  to  our  complete  set  of  axioms  for 
mathematics, we conclude that there can never be a systematic or mechanical
procedure for solving all mathematical problems. In other words, the
mathematician  will  never  be  replaced  by  a  machine. 

11. A Convention Concerning Bound Variables. Although we permit
such expressions as (x)(x) P, (x)(Ex) P, etc., for the sake of complete
generality, they serve no useful purpose, and we shall encounter them
almost not at all. So we shall agree that if we write a formula (x) P, then
if any variables occur bound within P they shall be distinct from x unless
we specifically state otherwise. In particular, if we write (x,y) Q, (x)(Ez) Q,
(x,y,z) Q, (Ey,z) Q, etc., then it shall be understood that x and y, or
x and z, or x, y, and z, or y and z, etc., are distinct variables unless explicitly
stated otherwise.


CHAPTER  VII 
EQUALITY 

1. General Properties. We introduce equality and use the familiar
notation x = y. If we were attaching meanings to our statements, the
meaning of x = y would be that x and y are two names of the same identical
object. We place no restrictions on the nature of the object, so that we
shall have equality not only between numbers (names of numbers, really),
as is common in mathematics, but between sets, or between functions, or
indeed between the names of any logical object.

We utilize the familiar symbolism, x ≠ y, for ∼(x = y). In the matter
of parentheses, we shall enclose x = y or x ≠ y in parentheses whenever
necessary to avoid confusion, but if no parentheses are written, one is to
understand the parentheses with the smallest possible scope.

If x and y are variables, the indicated occurrences of x and y are free in
x = y, and no other occurrences of any variable are free in x = y. The
significance of these remarks will become clear when we introduce a
definition of x = y in Chapter IX.

The familiar property of equals is:

"A quantity may be substituted for its equal in any equation or
inequality."

We take it in the more general form:

"A quantity may be substituted for its equal in any statement."

In  symbols,  this  takes  the  form: 

Axiom scheme 7A. Let x₁, x₂, ..., xₙ, x, y, z be variables, of which
x, y, and z are distinct, but of which x₁, x₂, ..., xₙ need not be distinct
either from each other or from x, y, or z. Let P be a statement which
contains no bound occurrences of x or y. Let Q and R be the results of
replacing all free occurrences of z in P by occurrences of x and y,
respectively. Then

(x₁, x₂, ..., xₙ)(x,y):x = y. ⊃ .Q ⊃ R

is an axiom.

That is, if P is F(x,y,z), then Axiom scheme 7A says:

(x,y):x = y. ⊃ .F(x,y,x) ⊃ F(x,y,y).

In words, if x = y, then one may replace some (or none or all) of the x's
of any statement F(x,y,x) by y's (getting F(x,y,y)).

The requirement that P (or F(x,y,z)) contain no bound occurrences of
x or y is not really restrictive. In any special case, one can replace some
free x's of an arbitrary Q by y's by first using Thm.VI.6.8 to replace all
bound x's and y's by other letters, then using Axiom scheme 7A to replace
the free x's by free y's, and finally using Thm.VI.6.8 to change the bound
letters back again.

The familiar wording, "A quantity may be substituted for its equal in
any statement," as a paraphrase for Axiom scheme 7A is actually quite
incorrect and thoroughly misleading. As we saw in Chapter III, a quantity
cannot appear in a statement about the quantity, and if it could so appear,
it would be ridiculous to speak of replacing the quantity by its equal
because its equal could not conceivably be anything other than the quantity
itself. Actually, not the quantity but a name of it appears in any statement
about it, and Axiom scheme 7A merely says that we may replace that
name by any other name of the same quantity. Needless to say, if we
should have a statement about the name rather than a statement containing
the name, Axiom scheme 7A would not permit replacement by some other
name of the same quantity.

In  other  words,  one  is  to  put  "  =  "  between  two  names  if  and  only  if 
they  are  names  of  the  same  object.  The  resulting  statement  about  the 
object,  that  it  does  not  differ  from  itself  in  any  way,  is  of  an  excessive 
degree  of  triviality.  Nevertheless  the  statement  does  call  our  attention  to 
the  fact  of  the  two  names  being  names  of  the  same  object,  and  hence  justifies 
our  using  the  names  interchangeably  in  statements  about  the  object.  This 
is  the  sole  meaning  of  Axiom  scheme  7A. 

Thus if we write "3/4 = 9/12," we indicate that "3/4" and "9/12" are names
of the same number. Then if we have any statement about this number
containing one of the names, say "3/4 < 3," we may use the other name in
this statement, to wit: "9/12 < 3." If, on the other hand, we have a
statement about one of the names, we are not entitled to make the same
statement about the other name. Thus "the denominator of '3/4' is not divisible
by 3" is true whereas "the denominator of '9/12' is not divisible by 3" is false.

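
The distinction between a statement containing a name and a statement about a name can be imitated in a programming language, where a string plays the part of the name and a rational value the part of the number named. This is an illustrative sketch only; the helper `written_denominator` is hypothetical.

```python
from fractions import Fraction

name_a, name_b = "3/4", "9/12"                          # two names
value_a, value_b = Fraction(name_a), Fraction(name_b)   # the number they name

def written_denominator(name):
    """Denominator as written in the name itself, not of the number named."""
    return int(name.split("/")[1])

same_number = value_a == value_b                        # "3/4 = 9/12"
both_below_three = value_a < 3 and value_b < 3          # statements about the number
a_not_divisible = written_denominator(name_a) % 3 != 0  # statement about the name '3/4'
b_not_divisible = written_denominator(name_b) % 3 != 0  # statement about the name '9/12'
```

Here `same_number` and `both_below_three` hold, since those statements concern the number; `a_not_divisible` holds while `b_not_divisible` fails, since those statements concern the names, and the names differ.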
Besides  Axiom  scheme  7A,  we  need: 

Axiom scheme 7B. Let x₁, x₂, ..., xₙ, x be variables, not necessarily
distinct. Then

(x₁, x₂, ..., xₙ)(x) x = x

is an axiom.

The property x = x is called the reflexive property of equality.

Later, after further ideas have been introduced, we shall be able to define
equality in terms of these further ideas. We shall then be able to replace
Axiom scheme 7A and Axiom scheme 7B by a single Axiom scheme 7. This
is why we are counting Axiom scheme 7A and 7B as parts of one axiom
scheme instead of as two axiom schemes.

We now prove the symmetric property of equality:

*Theorem VII.1.1. ⊢ (x,y).x = y ⊃ y = x.

Proof. Take P to be z = x in Axiom scheme 7A. Then Q is x = x and
R is y = x. So by Axiom scheme 7A, ⊢ x = y. ⊃ .x = x ⊃ y = x. So
⊢ x = x. ⊃ .x = y ⊃ y = x. So our theorem follows by Axiom scheme 7B.

By our convention stated at the end of Chapter VI, x and y denote
distinct variables in this theorem. A similar remark applies to subsequent
theorems.

We now prove the transitive property of equality:

**Theorem VII.1.2. ⊢ (x,y,z):x = y.y = z. ⊃ .x = z.

Proof. By Thm.VI.6.8, it suffices to prove

⊢ (w,x,y):w = x.x = y. ⊃ .w = y.

Take P to be w = z in Axiom scheme 7A. Then Q is w = x and R is
w = y. So by Axiom scheme 7A,

⊢ x = y. ⊃ .w = x ⊃ w = y.

From this, our theorem follows.

This is commonly stated as "Things equal to the same thing are equal to
each other." However, this is not quite an accurate rendering of the
meaning of the theorem. A more accurate version would be "If x and y are
names of the same object, and y and z are names of the same object, then
x and z are names of the same object."

Theorem VII.1.3. ⊢ (x,y):x = y. ⊃ .F(x) ⊃ F(y).

Proof. Let the x, y, and z of Axiom scheme 7A be taken to be three
distinct variables u, v, w which do not occur at all in F(x) or F(y). Then
F(w), F(u), and F(v) play the role of P, Q, and R in Axiom scheme 7A.
So by Axiom scheme 7A,

⊢ (u,v):u = v. ⊃ .F(u) ⊃ F(v).

From this our theorem follows by Thm.VI.6.8.
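
For readers who use a proof assistant, the pattern of the last three proofs (pick a suitable statement P, substitute, then discharge the reflexive instance) can be mirrored in Lean 4. This is a sketch in Lean's own type theory using only its core library, not a rendering of the axiom schemes of this text; the theorem names are illustrative.

```lean
-- Lean 4, core library only. All theorem names below are illustrative.

-- Analogue of Thm.VII.1.3: from x = y, pass from F x to F y.
theorem subst_pred {α : Type} (F : α → Prop) (x y : α) (h : x = y) :
    F x → F y :=
  fun hx => Eq.subst (motive := F) h hx

-- Analogue of Thm.VII.1.1: use "z = x" as the property of z,
-- then supply the reflexive instance x = x.
theorem eq_symm' {α : Type} (x y : α) (h : x = y) : y = x :=
  subst_pred (fun z => z = x) x y h (rfl : x = x)

-- Analogue of Thm.VII.1.2: use "w = z" as the property of z.
theorem eq_trans' {α : Type} (w x y : α) (h₁ : w = x) (h₂ : x = y) : w = y :=
  subst_pred (fun z => w = z) x y h₂ h₁
```

As in the text, symmetry and transitivity are not assumed separately; both fall out of substitution together with reflexivity.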

*Theorem VII.1.4. If x, y, z, P, Q, and R are as in Axiom scheme 7A, then

⊢ (x,y):Q∼R. ⊃ .x ≠ y.

Proof. This follows from ⊢ (x,y):x = y. ⊃ .∼(Q∼R) by Thm.VI.6.1,
Part XLV. In effect this theorem says

⊢ (x,y):F(x,y,x).∼F(x,y,y). ⊃ .x ≠ y.

This theorem is fairly important, since it is a generalized form of the
useful result that, if one can find a statement which is true for x but false
for y, then x ≠ y.

Theorem VII.1.5.

*I. ⊢ (y):F(y). ≡ .(Ex).x = y.F(x).
*II. ⊢ (y):F(y). ≡ .(x).x = y ⊃ F(x).

Proof of Part I. By Thm.VI.7.3, ⊢ y = y.F(y). ⊃ .(Ex).x = y.F(x).
So by Axiom scheme 7B,

⊢ F(y). ⊃ .(Ex).x = y.F(x).

Now by Thm.VII.1.3,

⊢ (x):x = y.F(x). ⊃ .F(y).

So by Thm.VI.6.6, Part VII,

⊢ (Ex).x = y.F(x): ⊃ :F(y).

To prove Part II, we put ∼F(x) for F(x) in Part I.

EXERCISES 

VII.1.1. Prove ⊢ (x,y):x = y. ≡ .y = x.

VII.1.2. Prove:

(a) ⊢ (x,y,z):.x = y: ⊃ :x = z. ≡ .y = z.

(b) ⊢ (x,y,z):.x = y: ⊃ :x ≠ z. ≡ .y ≠ z.

VII.1.3. Prove ⊢ (x,y):x = y. ⊃ .F(x) ≡ F(y).

VII.1.4. Prove:

(a) ⊢ (y,z):.y = z: ≡ :(x).y = x ⊃ x = z.

(b) ⊢ (y,z):.y = z: ≡ :(x):y = x. ≡ .x = z.

VII.1.5. Prove ⊢ (x,y,z):x = y.y ≠ z. ⊃ .x ≠ z.

VII.1.6. Prove ⊢ (x)(Ey).y = x.

VII.1.7. Let α and β be variables subject to the restriction K(x). Prove:

(a) ⊢ K(y)F(y). ≡ .(Eα).α = y.F(α).

(b) ⊢ (β):F(β). ≡ .(Eα).α = β.F(α).

(c) ⊢ K(y) ⊃ F(y). ≡ .(α).α = y ⊃ F(α).

(d) ⊢ (β):F(β). ≡ .(α).α = β ⊃ F(α).

2. Enumerative Quantifiers. The statement (Ex) F(x) is interpreted as
meaning that there is at least one x such that F(x) is true. We shall now
show how, for a given positive integer n, we can write formulas whose
interpretations would be:

(VII.2.1) "There are at least n different x's such that F(x)."
(VII.2.2) "There are at most n different x's such that F(x)."
(VII.2.3) "There are exactly n different x's such that F(x)."

These are not used sufficiently commonly to warrant a special notation
for them. However, throughout the next few paragraphs we shall use the
temporary notation (E⁽ⁿ⁾x) F(x) to denote (VII.2.1). We easily define
(E⁽ⁿ⁾x) F(x) for successively greater n's as follows:

(E⁽¹⁾x) F(x) ≡ (Ex) F(x),
(E⁽²⁾x) F(x): ≡ :(Ex):F(x):(E⁽¹⁾y).x ≠ y.F(y),
(E⁽³⁾x) F(x): ≡ :(Ex):F(x):(E⁽²⁾y).x ≠ y.F(y),
(E⁽⁴⁾x) F(x): ≡ :(Ex):F(x):(E⁽³⁾y).x ≠ y.F(y),

and so on.

Clearly ∼(E⁽ⁿ⁺¹⁾x) F(x) will denote (VII.2.2), and (E⁽ⁿ⁾x) F(x).
∼(E⁽ⁿ⁺¹⁾x) F(x) will denote (VII.2.3).

Various other interesting combinations can be written. Thus (E⁽ⁿ⁾x)
∼F(x).∼(E⁽ⁿ⁺¹⁾x)∼F(x) states that F(x) is true for all but n different x's,
and (E⁽ᵐ⁾x) F(x).∼(E⁽ⁿ⁺¹⁾x) F(x) states that F(x) is true for at least m but
at most n different x's.
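
Over a finite range of values the recursive definition of "at least n" can be executed directly, which gives a convenient check that it counts as intended. The sketch below is an illustration under that finiteness assumption; the function names are hypothetical, and `domain` must be a collection that can be traversed repeatedly, such as a list or a range.

```python
def at_least(n, domain, F):
    """At least n different x's satisfy F, by the recursive definition:
    some x satisfies F, and at least n-1 y's satisfy "x != y and F(y)"."""
    if n == 1:
        return any(F(x) for x in domain)
    return any(
        F(x) and at_least(n - 1, domain, lambda y, x=x: x != y and F(y))
        for x in domain
    )

def at_most(n, domain, F):
    """At most n: the denial of "at least n + 1"."""
    return not at_least(n + 1, domain, F)

def exactly(n, domain, F):
    """Exactly n: at least n and at most n."""
    return at_least(n, domain, F) and at_most(n, domain, F)

def is_even(x):
    return x % 2 == 0
```

With `domain = range(6)` the even values are 0, 2, and 4, so `exactly(3, range(6), is_even)` holds while `at_least(4, range(6), is_even)` and `at_most(2, range(6), is_even)` both fail.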

Of these various notions, only two are sufficiently often used to merit a
permanent notation, namely, "F(x) is true for at least one x," for which we
have the notation (Ex) F(x), and "F(x) is true for exactly one x," for which
we shall adopt the notation (E₁x) F(x).

We shall define (E₁x) F(x) as the simplest of several statements equivalent
to (E⁽¹⁾x) F(x).∼(E⁽²⁾x) F(x). In unabbreviated form, this is

(Ex).F(x):∼(Ex).F(x).(Ey).x ≠ y.F(y).

We shall now show this equivalent to four other statements, the last (and
simplest) of which we shall take as the definition of (E₁x) F(x).

Theorem VII.2.1.

⊢ (Ex).F(x):∼(Ex).F(x).(Ey).x ≠ y.F(y):.

≡ :.(Ex).F(x):(x,y):F(x).F(y). ⊃ .x = y:.

≡ :.(Ex):F(x).(y).F(y) ⊃ x = y:.

≡ :.(Ex)(y).x = y ≡ F(y):.

≡ :.(Ey)(x).y = x ≡ F(x).

Proof. Let these five statements be denoted by A, B, C, D, E in the
order named. We shall prove ⊢ A ≡ B, ⊢ B ⊃ C, ⊢ C ≡ D, ⊢ D ⊃ B, and
⊢ D ≡ E, which suffice to prove our theorem.



Proof of ⊢ A ≡ B. By Thm.VI.6.6, Part VI,

⊢ F(x).(Ey).x ≠ y.F(y): ≡ :(Ey).F(x).x ≠ y.F(y).

So

⊢ ∼(Ex).F(x).(Ey).x ≠ y.F(y): ≡ :∼(Ex,y).F(x).F(y).x ≠ y.

By the duality theorem

⊢ ∼(Ex,y).F(x).F(y).x ≠ y.: ≡ :.(x,y):F(x).F(y). ⊃ .x = y.

We now easily infer ⊢ A ≡ B.

Proof of ⊢ B ⊃ C. We first show B ⊢_C C. We assume B, whence we get
(Ex) F(x) and (x,y):F(x).F(y). ⊃ .x = y. So by rule C and Thm.VI.5.1,
F(x) and (y):F(x).F(y). ⊃ .x = y. So (y):F(x). ⊃ .F(y) ⊃ x = y. So
F(x). ⊃ .(y).F(y) ⊃ x = y. Then by F(x) and modus ponens, we get
(y).F(y) ⊃ x = y. Then F(x).(y).F(y) ⊃ x = y. Finally by Thm.VI.7.3,
(Ex):F(x).(y).F(y) ⊃ x = y. This is C. We can replace ⊢_C by ⊢, and then
⊢ B ⊃ C follows.

Proof of ⊢ C ≡ D. By Thm.VII.1.5, Part II,

⊢ F(x). ≡ .(y).x = y ⊃ F(y).

So by the equivalence theorem

⊢ C ≡ :.(Ex):.(y).x = y ⊃ F(y):(y).F(y) ⊃ x = y.

Then by Thm.VI.6.5, Part I, we get ⊢ C ≡ D.

Proof of ⊢ D ⊃ B. We first show D ⊢_C B. We assume D and get

(1) (y).x = y ≡ F(y)

by rule C. Then x = x ≡ F(x) by Axiom scheme 6, and so F(x) by Axiom
scheme 7B, and so

(2) (Ex) F(x)

by Thm.VI.7.3. Let z be a variable different from x and y, which does not
appear in F(y). Then by Axiom scheme 6, with (1),

(3) x = y ≡ F(y),

(4) x = z ≡ F(z).

However, ⊢ x = z.x = y. ⊃ .z = y. So by (3) and (4), F(z)F(y) ⊃ z = y.
So by two uses of rule G, (z,y):F(z).F(y). ⊃ .z = y. Then by Thm.VI.6.8,
(x,y):F(x).F(y). ⊃ .x = y. Finally by (2),

(Ex).F(x):(x,y):F(x).F(y). ⊃ .x = y.

This is B. We can replace ⊢_C by ⊢, and then ⊢ D ⊃ B follows.

Proof of ⊢ D ≡ E. Choose z a variable different from x and y and not
appearing in F(x). Then by Thm.VI.6.8,

⊢ D ≡ :(Ez)(y).z = y ≡ F(y):

≡ :(Ez)(x).z = x ≡ F(x):

≡ :(Ey)(x).y = x ≡ F(x).

This is ⊢ D ≡ E.

Note that in A, B, and C, each of F(x) and F(y) occur, and so our
convention says that F(y) is the result of replacing all free occurrences of x in
F(x) by occurrences of y, and that F(x) is the result of replacing all free
occurrences of y in F(y) by occurrences of x. Since only F(y) appears in
D and only F(x) appears in E, it is perhaps necessary to state explicitly that
our convention applies to D and E also. In particular, in E, there are no
free occurrences of y in F(x). Then by Thm.VI.6.8, we can change y to
any other variable (except x) with no free occurrences in F(x). This
condition that y have no free occurrences in F(x) needs to be stated when we
give the definition of (E₁x) F(x).

Definition. If y is a variable distinct from x with no free occurrences
in P, then we define (E₁x) P to be (Ey)(x).y = x ≡ P.

There are various ways of expressing (E₁x) F(x) in English. For instance:

"F(x) is true for exactly one x."

"F(x) is true for one and only one x."

"For precisely one x, F(x)."

"For one and only one x, F(x)."

Statement B of the statements proved equivalent in Thm.VII.2.1 is
particularly suited to the phraseology "one and only one," since (Ex) F(x)
indicates that F(x) is true for one x, and (x,y):F(x).F(y). ⊃ .x = y indicates
that F(x) is true for only one x.
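
The five statements A to E of Thm.VII.2.1 can be compared by brute force when x and y range over a small finite set. The sketch below is such a sanity check, not a proof; the names `forms` and `all_agree` are hypothetical, and every predicate on a set of the given size is tried in turn.

```python
from itertools import product

def forms(domain, F):
    """Evaluate statements A, B, C, D, E of Thm.VII.2.1 over a finite domain."""
    some = any(F(x) for x in domain)
    A = some and not any(
        F(x) and any(x != y and F(y) for y in domain) for x in domain)
    B = some and all(
        (not (F(x) and F(y))) or x == y for x in domain for y in domain)
    C = any(F(x) and all((not F(y)) or x == y for y in domain) for x in domain)
    D = any(all((x == y) == F(y) for y in domain) for x in domain)
    E = any(all((y == x) == F(x) for x in domain) for y in domain)
    return A, B, C, D, E

def all_agree(size):
    """Check that A..E agree, and mean "exactly one", for every F on range(size)."""
    domain = range(size)
    for bits in product([False, True], repeat=size):
        F = lambda x, bits=bits: bits[x]
        values = forms(domain, F)
        if len(set(values)) != 1 or values[0] != (sum(bits) == 1):
            return False
    return True
```

Calling `all_agree(4)` runs through all sixteen predicates on a four-element set and confirms that the five forms coincide with "F holds of exactly one element" in each case.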

Theorem VII.2.2. ⊢ (z)(E₁x) x = z.

Proof. By Thm.VI.7.3, ⊢ z = z. ⊃ .(Ey) y = z. So by Axiom scheme
7B, ⊢ (Ey) y = z. Then by Ex.VII.1.4(b), ⊢ (Ey)(x):y = x. ≡ .x = z.
By definition, this is ⊢ (E₁x) x = z, and our theorem follows.

There is a question as to what (E₁α) F(α) shall denote when α is a variable
subject to the restriction K(α). It was agreed that in such case (α) F(α)
should denote (x).K(x) ⊃ F(x), and (Eα) F(α) should denote (Ex).K(x).
F(x). So, by referring to the definition, (E₁α) F(α) denotes (Eβ)(α).
β = α ≡ F(α), namely,

(Eβ)(x):K(x). ⊃ .β = x ≡ F(x).

The question is whether we should put a restriction on β, and if so what?
It turns out that we should put the same restriction on β that we put on α.
So (E₁α) F(α) denotes

(Ey).K(y):(x):K(x). ⊃ .y = x ≡ F(x).

This seems awkward, but it enables us to prove the following theorem,
which gives us a much simpler, and completely natural, interpretation for
(E₁α) F(α).

Theorem VII.2.3. If we are using restricted quantification with α
subject to the restriction K(α), then

⊢ (E₁α) F(α): ≡ :(E₁x).K(x).F(x).

Proof. We use two lemmas.

Lemma A.

K(y):(x):K(x). ⊃ .y = x ≡ F(x) ⊢ (x):y = x. ≡ .K(x).F(x).

Proof. We first establish

K(y):(x):K(x). ⊃ .y = x ≡ F(x), y = x ⊢ K(x).F(x),

by noting that if we have K(y) and y = x, we get K(x) by Thm.VII.1.3,
and then from K(x), y = x, and (x):K(x). ⊃ .y = x ≡ F(x), we get F(x).
We next establish

K(y):(x):K(x). ⊃ .y = x ≡ F(x), K(x).F(x) ⊢ y = x,

for from K(x), F(x), and (x):K(x). ⊃ .y = x ≡ F(x), we get y = x. From
these two results, our lemma follows.

Lemma B.

(x):y = x. ≡ .K(x).F(x) ⊢ (x):K(x). ⊃ .y = x ≡ F(x).

Proof. This follows easily from the two obvious results

y = x. ≡ .K(x).F(x), K(x), y = x ⊢ F(x),

y = x. ≡ .K(x).F(x), K(x), F(x) ⊢ y = x.

We now proceed with the proof of our theorem. First assume $(E_1\alpha)\,F(\alpha)$.
Then by rule C

$K(y):(x):K(x).\supset.y = x \equiv F(x).$

So by Lemma A

$(x):y = x.\equiv.K(x).F(x).$

So $(Ey)(x):y = x.\equiv.K(x).F(x)$. This is $(E_1 x).K(x).F(x)$.
Second assume $(E_1 x).K(x).F(x)$. Then by rule C

(1) $(x):y = x.\equiv.K(x).F(x).$

So by Lemma B,

(2) $(x):K(x).\supset.y = x \equiv F(x).$

Also, from (1) by Axiom scheme 6, we get

$y = y.\equiv.K(y).F(y).$

So $K(y)$ by Axiom scheme 7B. Hence by (2)

$K(y):(x):K(x).\supset.y = x \equiv F(x).$

So

$(Ey):K(y):(x):K(x).\supset.y = x \equiv F(x).$

This is $(E_1\alpha)\,F(\alpha)$.

By making several uses of Thm.VI.8.1, we could avoid the necessity for
using lemmas in the proof above.

Because of the theorem just proved, we need not use the strict
interpretation for $(E_1\alpha)\,F(\alpha)$ but can always use the more natural
interpretation $(E_1 x).K(x).F(x)$.
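
As an informal sanity check (not a substitute for the formal proof), the two
readings of $(E_1\alpha)\,F(\alpha)$ can be compared by brute force over a small
finite domain. The sketch below is in Python; the function names and the
choice of the domain {0, ..., n-1} are assumptions made only for this
illustration, and "exactly one witness" is used as the finite-domain meaning
of $(E_1 x)$.

```python
from itertools import product


def strict_reading(domain, K, F):
    """(Ey).K(y):(x):K(x). > .y = x == F(x), evaluated over a finite domain."""
    return any(
        K(y) and all((not K(x)) or ((y == x) == F(x)) for x in domain)
        for y in domain
    )


def natural_reading(domain, K, F):
    """(E1 x).K(x).F(x): exactly one x in the domain satisfies K(x).F(x)."""
    return sum(1 for x in domain if K(x) and F(x)) == 1


def agree_on_all_predicates(n):
    """Compare both readings for every pair of predicates K, F on range(n)."""
    domain = range(n)
    tables = list(product([False, True], repeat=n))
    for k_table, f_table in product(tables, repeat=2):
        K = lambda x, t=k_table: t[x]  # predicate given by its truth table
        F = lambda x, t=f_table: t[x]
        if strict_reading(domain, K, F) != natural_reading(domain, K, F):
            return False
    return True
```

Calling `agree_on_all_predicates(3)` runs through all 64 pairs of predicates
on a three-element domain; agreement there illustrates the theorem but of
course proves nothing about infinite domains.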

EXERCISES 

VII.2.1. Let $\alpha$ and $\beta$ be variables subject to the restriction $K(x)$. Prove:

$\vdash (E_1\alpha).F(\alpha):\equiv:(E\alpha).F(\alpha).\sim(E\beta).\alpha \neq \beta.F(\beta):.$

$\equiv:(E\alpha).F(\alpha):(\alpha,\beta):F(\alpha).F(\beta).\supset.\alpha = \beta:.$

$\equiv:(E\alpha):F(\alpha).(\beta).F(\beta) \supset \alpha = \beta:.$

$\equiv:(E\alpha)(\beta).\alpha = \beta \equiv F(\beta):.$

$\equiv:(E\beta)(\alpha).\beta = \alpha \equiv F(\alpha).$

VII.2.2. Let $x$, $y$, and $z$ be distinct variables, of which $y$ and $z$ do not
occur in $P$. Prove:

(a) $\vdash (E_1 x)\,P.\supset:.\sim(Ey).Q.(x).y = x \equiv P:\supset:(Ey).\sim Q.(x).y = x \equiv P.$

(b) $\vdash (Ey).\sim Q.(x).y = x \equiv P:\supset:\sim(Ey).Q.(x).y = x \equiv P.$ (Hint. Use
reductio ad absurdum.)

(c) $\vdash (z)(Ey).Q.(x).y = x \equiv P:\equiv:(Ey).(z)Q.(x).y = x \equiv P.$ (Hint.
Prove the lemma $\vdash (z)(Ey).Q.(x).y = x \equiv P:\supset:(E_1 x)\,P.$)

(d) $\vdash (Ey).Q.R.(x).y = x \equiv P:\equiv:(Ey).Q.(x).y = x \equiv P:(Ey).R.(x).y = x \equiv P.$

VII.2.3. Prove $\vdash (x).P \equiv Q:\supset:(E_1 x)P \equiv (E_1 x)Q.$

VII.2.4. Prove $\vdash (E_1 x)P \supset (Ex)P.$

VII.2.5. Prove $\vdash (x,y):F(x).F(y).\supset.x = y.:\equiv:.(x,y):F(x).x \neq y.\supset.\sim F(y).$
3. Applications. We can prove the trivial equality $x = x$, and we can
derive  new  equalities  from  given  ones  by  the  commutative  and  transitive 
laws  of  equality.  However,  we  have  as  yet  no  direct  way  to  prove  any 
really  useful  equality,  though  we  shall  soon  add  axiom  schemes  which  will 
enable  us  to  prove  useful  equalities.  Meanwhile,  one  can  prove  equalities 
by  indirect  means.  Consider  the  following  proof  of  equality  by  reductio  ad 
absurdum. 

Theorem. A sequence can have at most one limit. That is, if
$\lim_{n\to\infty} a_n = \alpha$ and $\lim_{n\to\infty} a_n = \beta$, then $\alpha = \beta$.

Proof. We use restricted quantification, with $m$ and $n$ denoting positive
integers and $\varepsilon$ denoting a positive real number. We give the same meaning
to "$n \in \mathrm{PI}$" and "$\varepsilon > 0$" as before.

We assume $\lim a_n = \alpha$, namely,

(1) $(\varepsilon)(En)(m).m > n \supset |a_m - \alpha| < \varepsilon.$

We also assume $\lim a_n = \beta$, namely,

(2) $(\varepsilon)(En)(m).m > n \supset |a_m - \beta| < \varepsilon.$

Since we intend to prove $\alpha = \beta$ by reductio ad absurdum, we also assume

(3) $\alpha \neq \beta.$

From this we get

(4) $|\alpha - \beta| > 0,$

whence we get

(5) $\tfrac{1}{2}|\alpha - \beta| > 0.$

Then by (1) and (2)

(6) $(En)(m).m > n \supset |a_m - \alpha| < \tfrac{1}{2}|\alpha - \beta|,$

(7) $(En)(m).m > n \supset |a_m - \beta| < \tfrac{1}{2}|\alpha - \beta|.$

So by rule C,

(8) $n \in \mathrm{PI},$

(9) $(m).m > n \supset |a_m - \alpha| < \tfrac{1}{2}|\alpha - \beta|,$

(10) $N \in \mathrm{PI},$

(11) $(m).m > N \supset |a_m - \beta| < \tfrac{1}{2}|\alpha - \beta|.$

By (8) and (10),

(12) $n + N \in \mathrm{PI},$

(13) $n + N > n,$

(14) $n + N > N.$

Then by (12) and (9)

$n + N > n \supset |a_{n+N} - \alpha| < \tfrac{1}{2}|\alpha - \beta|,$

and by (12) and (11)

$n + N > N \supset |a_{n+N} - \beta| < \tfrac{1}{2}|\alpha - \beta|.$

So by (13) and (14),

(15) $|a_{n+N} - \alpha| < \tfrac{1}{2}|\alpha - \beta|,$

(16) $|a_{n+N} - \beta| < \tfrac{1}{2}|\alpha - \beta|.$

From (15) and (16) by use of some theorems of mathematics, we get

$|\alpha - \beta| = |-(a_{n+N} - \alpha) + (a_{n+N} - \beta)|$

${}\le |a_{n+N} - \alpha| + |a_{n+N} - \beta|$

${}< \tfrac{1}{2}|\alpha - \beta| + \tfrac{1}{2}|\alpha - \beta|$

${}= |\alpha - \beta|.$

So

(17) $|\alpha - \beta| < |\alpha - \beta|.$

However, this contradicts

(18) $\sim(|\alpha - \beta| < |\alpha - \beta|)$

which follows from (4) by a known theorem of mathematics.

Strictly  speaking,  our  theorem  should  have  the  additional  hypothesis 
that  we  are  operating  in  a  metric  space  and  should  read  "In  a  metric  space, 
a  sequence  can  have  at  most  one  limit."  The  reader  who  is  acquainted 
with  metric  spaces  will  recognize  that  this  hypothesis  justifies  the  step  from 
(3)  to  (4)  and  the  step  from  (15)  and  (16)  to  (17).  Needless  to  say,  it 
would be permissible to make stronger assumptions such as that $\alpha$, $\beta$, and
all the $a$'s are real (or complex) numbers.

There  is  a  subtle  point  in  connection  with  the  use  of  Axiom  scheme  7A 
which  is  overlooked  by  all  but  the  most  careful  writers.    Suppose  we  have 

$\lim a_n = \alpha$

and $(n).a_n = b_n$. At first sight the step to

$\lim b_n = \alpha$

appears to be a simple application of Axiom scheme 7A. However, the
situation is more complex than this. By repeated applications of Axiom
scheme 7A, we can replace each of a finite number of quantities by its equal.
However, in the present situation, we have an infinite number of quantities,
$a_n$, each to be replaced by its equal. This cannot be justified by a direct
application of Axiom scheme 7A.

In  general,  each  situation  requiring  an  infinite  number  of  uses  of  Axiom 
scheme  7A  requires  individual  attention.  We  now  give  a  valid  proof  which 
covers  this  situation.  It  will  be  noted  that  the  method  of  procedure  is 
fairly  general  and  can  easily  be  modified  to  handle  other  cases.  Let  us  use 
restricted  quantification,  with  the  same  conventions  as  in  the  preceding 
proof. 

Assume

$(n)\; a_n = b_n.$

Then by Axiom scheme 6

$m \in \mathrm{PI} \supset a_m = b_m.$

So by Ex.VII.1.3,

$m \in \mathrm{PI}:\supset:.m > n \supset |a_m - \alpha| < \varepsilon:\equiv:m > n \supset |b_m - \alpha| < \varepsilon.$

Then by Thm.VI.6.1, Part XXVII,

$m \in \mathrm{PI}.\supset.m > n \supset |a_m - \alpha| < \varepsilon:\equiv:m \in \mathrm{PI}.\supset.m > n \supset |b_m - \alpha| < \varepsilon.$

So by three uses of Thm.VI.4.2,

$(\varepsilon,n,m):.m \in \mathrm{PI}.\supset.m > n \supset |a_m - \alpha| < \varepsilon:\equiv:m \in \mathrm{PI}.\supset.m > n \supset |b_m - \alpha| < \varepsilon,$

where here we temporarily suspend our convention that we are using
restricted quantification with $\varepsilon$, $n$, and $m$. Then by Thm.VI.5.4,

$(\varepsilon)(En)(m).m > n \supset |a_m - \alpha| < \varepsilon:\equiv:(\varepsilon)(En)(m).m > n \supset |b_m - \alpha| < \varepsilon,$

where we are now using restricted quantification.
This is the statement

$\lim a_n = \alpha.\equiv.\lim b_n = \alpha,$

and we can now proceed from $\lim a_n = \alpha$ to $\lim b_n = \alpha$.

The transitive property of equality is commonly used in everyday
mathematics, where it is commonly (and not altogether accurately) paraphrased
as "Things equal to the same thing are equal to each other." For instance,
to solve the two simultaneous equations $x + y = 5$ and $x + 2y = 7$, we
reduce them to $x = 5 - y$ and $x = 7 - 2y$, whence we get $5 - y = 7 - 2y$
by the transitive law.

More commonly, Axiom scheme 7A is used in solving simultaneous
equations. Thus, given $x + y = 5$ and $xy = 6$, we get $x = 6/y$ from the
second and "substitute" into the first, getting

$6/y + y = 5.$

This substitution is merely a use of Axiom scheme 7A. To see this, let
$F(z)$ be $z + y = 5$. Then $F(x)$ is $x + y = 5$ and $F(6/y)$ is $6/y + y = 5$.
Then Axiom scheme 7A states

$x = 6/y:\supset:x + y = 5.\supset.6/y + y = 5.$
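
The substitution step can be mimicked numerically. The following Python
sketch treats $F(z)$ as a function of $z$ (with $y$ as a parameter), sets
$x = 6/y$ as obtained from $xy = 6$, and keeps those $y$ for which $F(6/y)$
holds. The helper names and the restriction to a finite list of integer
candidates for $y$ are assumptions made purely for illustration.

```python
from fractions import Fraction


def F(z, y):
    """The statement F(z): z + y = 5, with y treated as a parameter."""
    return z + y == 5


def solutions(candidates):
    """Return the pairs (x, y) with x = 6/y for which F(6/y) holds."""
    found = []
    for y in candidates:
        if y == 0:
            continue  # 6/y is undefined here, so xy = 6 cannot hold
        x = Fraction(6, y)  # x = 6/y, from the second equation
        if F(x, y):  # F(6/y), i.e. 6/y + y = 5
            found.append((x, y))
    return found
```

Because $x$ and $6/y$ are equal, $F(x)$ and $F(6/y)$ are the same test here,
which is all that the appeal to Axiom scheme 7A amounts to in this case.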

Another instance of the use of Axiom scheme 7A is in deriving such
principles as

$(x,y,X,Y):x = y.X = Y.\supset.x + X = y + Y.$

This is commonly (and not altogether accurately) paraphrased as "If
equals are added to equals, the sums are equal." This is derived as follows.
First take $F(x,y,z)$ to be $x + X = z + X$ in Axiom scheme 7A, and infer

(1) $x = y:\supset:x + X = x + X.\supset.x + X = y + X.$

Now take $F(X,Y,Z)$ to be $x + X = y + Z$ in Axiom scheme 7A, and
infer

(2) $X = Y:\supset:x + X = y + X.\supset.x + X = y + Y.$

Now by Axiom scheme 7B, we have $x + X = x + X$. So by (1), we get

$x = y.\supset.x + X = y + X.$

Combining this with (2) gives

$x = y.X = Y.\supset.x + X = y + Y.$

Obviously this proof makes use of no properties of $+$ except that $x + y$
is an object. A precisely similar proof would show

$(x,y,X,Y):x = y.X = Y.\supset.xX = yY,$

$(x,y,X,Y):x = y.X = Y.\supset.x^X = y^Y,$

$(x,y,X,Y):x = y.X = Y.\supset.\log_x X = \log_y Y,$

etc.
We have had an instance of the use of Thm.VII.1.4 on page 154. We
had proved

(16) $|y - z| < |x - y|.$

We take Thm.VII.1.4 in the form

$\vdash (z,x):F(z,x,z).\sim F(z,x,x).\supset.z \neq x.$

We take $F(z,x,z)$ to be (16), and then $F(z,x,x)$ is

$|y - x| < |x - y|.$

However, a theorem of mathematics states

$\sim(|y - x| < |x - y|).$

So we have $F(z,x,z)$ and $\sim F(z,x,x)$, and hence infer

(17) $z \neq x.$

In connection with $(E_1 x)\,P$ and related matters, we should like to
consider two postulates and three theorems of plane geometry, as given by
Wentworth and Smith. We shall use restricted quantification, letting
capital roman letters denote points, and small greek letters denote straight
lines. Further, we shall adopt the following shorthand notations for
geometric ideas:

$\alpha \perp \beta$ ............ $\alpha$ is perpendicular to $\beta$,

$\alpha \parallel \beta$ ............ $\alpha$ is parallel to $\beta$,

$A$ on $\alpha$ ......... $A$ is on $\alpha$,

$A$ on $\alpha$ ......... $\alpha$ passes through $A$.

On page 23, Wentworth and Smith state Postulate 1:¹

"One straight line and only one can be drawn through two given points."

In symbols:

$(A,B):A \neq B.\supset.(E_1\alpha).A \text{ on } \alpha.B \text{ on } \alpha.$

On page 46 is given the definition of parallel lines:¹
"Lines that lie in the same plane and cannot meet however far produced
are called parallel lines."
In symbols:

$\alpha \parallel \beta:\equiv:\alpha$ and $\beta$ are in the same plane$:\sim(EA).A \text{ on } \alpha.A \text{ on } \beta.$

Immediately below the definition of parallel lines is given the parallel
postulate:¹

¹ From "Plane and Solid Geometry" by George Wentworth and D. E. Smith,
copyright 1913, courtesy of Ginn & Company.
"Through a given point only one line can be drawn parallel to a given
line."

We recall that $(Ex)\,F(x)$ means that there is at least one $x$ such that $F(x)$,
and $(x,y):F(x).F(y).\supset.x = y$ means that there is only one $x$ such that
$F(x)$. So we can render the parallel postulate into symbols as follows:

$(A,\gamma)(\alpha,\beta):\alpha \parallel \gamma.A \text{ on } \alpha.\beta \parallel \gamma.A \text{ on } \beta.\supset.\alpha = \beta.$

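Now let us look at Proposition VIII on page 39:¹

"Only one perpendicular can be drawn to a given line from a given
external point."
In symbols:

$(A,\gamma):.\sim(A \text{ on } \gamma):\supset:(\alpha,\beta):\alpha \perp \gamma.A \text{ on } \alpha.\beta \perp \gamma.A \text{ on } \beta.\supset.\alpha = \beta.$

Since, by truth values, $\vdash PQ \supset R.\equiv.P{\sim}R \supset {\sim}Q$ (the $P$, $Q$, and $R$ here
denote statements, and not points), an alternative version of this is

$(A,\gamma):.\sim(A \text{ on } \gamma):\supset:(\alpha,\beta):\alpha \perp \gamma.A \text{ on } \alpha.A \text{ on } \beta.\alpha \neq \beta.\supset.\sim(\beta \perp \gamma).$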
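
The truth-value principle invoked here, $PQ \supset R.\equiv.P{\sim}R \supset {\sim}Q$, can be
confirmed by running through all eight assignments of truth values. The
short Python sketch below does just that; the function names are
illustrative only.

```python
from itertools import product


def implies(a, b):
    """Material implication: a > b is false only when a is true and b false."""
    return (not a) or b


def contraposition_variant_is_tautology():
    """Check PQ > R == P~R > ~Q under every assignment to P, Q, R."""
    return all(
        implies(p and q, r) == implies(p and not r, not q)
        for p, q, r in product([False, True], repeat=3)
    )
```

Both sides fail only when $P$ and $Q$ are true and $R$ is false, which is why
the two versions of Proposition VIII are interchangeable.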

It is in this alternative form that Wentworth and Smith prove the
proposition. We give a condensed version of their proof (see Fig. VII.3.1).¹

[Fig. VII.3.1.]

Given: A line $XY$, $P$ an external point, $PO$ a perpendicular to $XY$ from
$P$, and $PZ$ any other line from $P$ to $XY$.

To prove: That $PZ$ is not $\perp$ to $XY$.
(The reader will note that our $A$ is their $P$, our $\gamma$ is their $XY$, our $\alpha$ is
their $PO$, and our $\beta$ is their $PZ$. Then, in our notation, they are assuming
$\sim(A \text{ on } \gamma)$, $\alpha \perp \gamma$, $A$ on $\alpha$, $A$ on $\beta$, and $\alpha \neq \beta$, and are undertaking to
prove $\sim(\beta \perp \gamma)$.)

Proof. Produce $PO$ to $P'$, making $OP'$ equal $PO$, and draw $P'Z$. By
construction, $POP'$ is a straight line, and so by Postulate 1, $PZP'$ is not a
straight line. Hence $\angle P'ZP$ is not a straight angle. Now $PO = P'O$,
$OZ = OZ$, and $\angle POZ = \angle P'OZ$ (since each is a right angle). Hence the
triangles $POZ$ and $P'OZ$ are congruent. So $\angle OZP = \angle OZP'$, so that
$\angle OZP$ is half of $\angle PZP'$. As $\angle PZP'$ is not a straight angle, $\angle OZP$ is
not a right angle, and so $PZ$ is not $\perp$ to $XY$.


¹ Ibid.



We should like to pay special attention to the use of Postulate 1 in this
proof, since it involves uses of Axiom scheme 7A. If we put in the details
of the use of Postulate 1, they might run as follows: Suppose $PZP'$ is a
straight line. Then by Postulate 1, $PZP' = POP'$. But $Z$ on $PZP'$. So
by Axiom scheme 7A, $Z$ on $POP'$. Then $Z = O$, for if $Z \neq O$, then $POP'$
and $XY$ would coincide by Postulate 1, since each passes through the
distinct points $Z$ and $O$. However, if $Z = O$, then $PZ = PO$ (apply Axiom
scheme 7A to the statement $PZ = PZ$), contrary to assumption.

This completes the proof of Proposition VIII. We turn now to
Proposition XIV on page 46:¹

"Two  lines  in  the  same  plane  perpendicular  to  the  same  line  cannot  meet 
however  far  they  are  produced." 

Actually,  the  proposition  as  stated  is  false,  as  one  can  see  by  considering 
in  3-space  the  configuration  consisting  of  three  mutually  perpendicular 
lines  through  a  common  point.  An  accurate  statement  of  the  proposition 
would  be: 

"If  each  of  two  lines  is  perpendicular  to  a  third,  and  there  is  a  single 
plane  in  which  all  three  lines  lie,  then  the  two  original  lines  cannot  meet 
however  far  they  are  produced." 

Actually,  since  at  this  point  Wentworth  and  Smith  have  not  yet  embarked 
on  the  study  of  solid  geometry,  it  seems  most  appropriate  to  assume  for 
all  statements  in  this  portion  of  the  book  that  they  apply  only  to  figures 
which can be embedded in a plane. If this is not assumed, then the
statement,¹ "From a given point in a given line only one perpendicular can be
drawn to the line," which appears on page 23 of Wentworth and Smith is
clearly false.

So we make the assumption that we are considering only figures in a
plane, and simplify Proposition XIV to:

"Two lines perpendicular to the same line cannot meet however far they
are produced."

In symbols:

$(\alpha,\beta,\gamma):\alpha \neq \beta.\alpha \perp \gamma.\beta \perp \gamma.\supset.\sim(EA).A \text{ on } \alpha.A \text{ on } \beta.$

The proof is by reductio ad absurdum.¹ We assume $\alpha \neq \beta.\alpha \perp \gamma.\beta \perp \gamma$
and $(EA).A \text{ on } \alpha.A \text{ on } \beta$ and try to derive a contradiction. By rule C, we
get $A \text{ on } \alpha.A \text{ on } \beta$. Now if $A$ on $\gamma$, we get a contradiction by the result
which we quoted from page 23 of Wentworth and Smith (Wentworth and
Smith overlook this possibility), and if $\sim(A \text{ on } \gamma)$ we get a contradiction
by Proposition VIII (see above).

Incidentally, the above proof is an illustration of the advantage of
formalizing statements and proofs. With the statements and proofs in
words, it is very easy to overlook the case, $A$ on $\gamma$, but when we are using
symbols it would be very hard to overlook it.

We come now to the third (and most interesting) proposition, namely,
Proposition XV on page 47:¹

"If a line is perpendicular to one of two parallel lines, it is perpendicular
to the other also."

This proposition would be false if we did not have the implicit assumption
that the entire figure lies in a plane. However, we are assuming this.

In symbols, the proposition would read

$(\alpha,\beta,\gamma):\alpha \neq \beta.\alpha \parallel \beta.\gamma \perp \alpha.\supset.\gamma \perp \beta.$

Actually, the condition $\alpha \neq \beta$ is unnecessary, since Wentworth and Smith
have framed their definition of $\alpha \parallel \beta$ so that one can prove

$\vdash \alpha \parallel \beta \supset \alpha \neq \beta.$

So we shall prove the theorem in the (apparently) stronger form

$(\alpha,\beta,\gamma):\alpha \parallel \beta.\gamma \perp \alpha.\supset.\gamma \perp \beta,$

using a proof adapted from Wentworth and Smith (see Fig. VII.3.2).¹


[Fig. VII.3.2.]

We assume $\alpha \parallel \beta$ and $\gamma \perp \alpha$, and the entire figure lying in some plane.
By the definition of perpendicularity, $\alpha$ and $\gamma$ have a point in common.
By rule C, call it $A$. Now $\gamma$ must intersect $\beta$, else (since $\gamma$ and $\beta$ are
coplanar) we would have $\gamma \parallel \beta$, whence we would get $\alpha = \gamma$ by the parallel
postulate, and this is impossible since $\gamma \perp \alpha$. By rule C, we can let $B$ be
the intersection of $\beta$ and $\gamma$. Through $B$ we draw $\delta$ (in the plane of our figure)
with $\delta \perp \gamma$. Then $\delta \neq \alpha$, for if $\delta = \alpha$, then $B$ on $\alpha$ by Axiom scheme 7A,
so that $\alpha$ and $\beta$ meet in $B$, contradicting $\alpha \parallel \beta$. (Wentworth and Smith
overlook the necessity for proving $\delta \neq \alpha$.) Then $\delta$ and $\alpha$ never meet, by
Proposition XIV (see above), since $\delta \neq \alpha.\delta \perp \gamma.\alpha \perp \gamma$. So, since $\delta$ and $\alpha$
are coplanar by construction, we have $\delta \parallel \alpha$. Now, with $\delta$ and $\beta$ both
parallel to $\alpha$ and both passing through $B$, we get $\delta = \beta$ by the parallel
postulate. We now come to the most interesting step of the whole proof.
We have just shown $\delta = \beta$. Also we have $\gamma \perp \delta$ by construction. So
$\gamma \perp \beta$ by Axiom scheme 7A.

¹ Ibid.

EXERCISES 

VII.3.1.  In  the  three  geometric  proofs  which  have  just  been  given, 
point  out  uses  of  the  logical  principles  listed  in  Sec.  5  of  Chapter  II. 

VII.3.2.  Find  an  illustration  in  the  mathematical  literature  of  the  use  of 
Axiom  scheme  7A. 


CHAPTER  VIII 
DESCRIPTIONS 

1.  Axioms  for  Descriptions.  In  Chapter  III,  and  again  in  Chapter  VII, 
when  we  spoke  of  names,  we  were  using  the  term  "name"  in  its  most  general 
sense  as  any  word  or  symbol  or  arrangement  of  words  and  symbols  which 
signifies  some  object  and  is  used  in  statements  to  refer  to  that  object.  In 
general,  there  are  many  names  for  a  given  object.  This  is  familiar  in  the 
case  of  names  of  persons.  Thus  we  might  find  "Jack"  and  "Mr.  John 
Murgatroyd  Smith"  as  names  for  the  same  person.  In  a  certain  legal 
document  "the  party  of  the  first  part"  may  serve  as  an  additional  name 
for  this  same  person.  In  army  records  he  would  be  referred  to  by  a  number. 
On a ticket for overparking which a policeman leaves attached to the
steering wheel of his car, he would be referred to as the "owner of a Nash
4-door sedan, registration TP 899, year 1949."

Of  these  various  names,  the  last  differs  from  the  rest  in  one  important 
particular,  in  that  it  identifies  the  person  completely,  even  to  someone  with 
no  previous  acquaintance  with  him.  The  other  names  are  names  of  this 
particular  person  only  by  agreement  or  usage,  and  if  a  newcomer  is  unaware 
of  the  agreement  or  usage  he  would  have  no  way  of  identifying  the  person 
from  his  name. 

A  similar  situation  occurs  in  mathematics.  By  agreement  and  usage, 
"e"  is  the  name  of  a  certain  number. 

$\lim_{n \to \infty} \left(1 + \frac{1}{n}\right)^{n}$

is  another  name  of  the  same  number.  However,  the  second  name  identifies 
the  number  unequivocally,  even  to  one  who  has  never  before  heard  of  it, 
and  so  the  second  name  could  never  become  the  name  of  a  different  number. 
On  the  other  hand,  a  change  in  usage  would  suffice  to  make  "e"  the  name 
of  quite  a  different  number. 

We shall use the word "description" to indicate a name which by its own
structure unequivocally identifies the object of which it is a name. It is
not our intention to pursue further the distinction between descriptions and
more ordinary names, or to analyze the mechanism by which a name can be
associated with an object if the name is not a description of the object. We
proceed now to a formal treatment of descriptions.

The most commonly occurring form of description is constructed as
follows. One finds somehow a statement $F(x)$ which is true exactly when $x$
is the object in question. We then take as a description of our object "the
$x$ such that $F(x)$."

This is symbolized by "$\iota x\,F(x)$", which is read "the $x$ such that $F(x)$",
and which is called a description.

In connection with this notation, the following point arises immediately.
If $(E_1 x)\,F(x)$, then $\iota x\,F(x)$ is a name of the unique object which makes $F(x)$
true. However, suppose $\sim(E_1 x)\,F(x)$, so that there is no unique object
which makes $F(x)$ true. Shall we permit $\iota x\,F(x)$ in such cases, and what
meaning shall we assign to it?

It will be recalled that we have not committed ourselves to using only
formulas which have meaning. So we agree to use $\iota x\,F(x)$ in the case of
any $F(x)$, and in case $\sim(E_1 x)\,F(x)$, we shall simply consider $\iota x\,F(x)$ as a
meaningless formula. In fact, we shall not even require that $F(x)$ contain $x$.
Given an arbitrary statement $P$, we shall admit $\iota x\,P$ as a formula. This is
to be a name and may properly occur as a constituent of statements, even
though it may fail to be a name of anything.

If the reader is unhappy about using formulas without meaning, he can
arrange for $\iota x\,P$ to have a meaning, regardless of what $x$ and $P$ are, by
adopting the following convention. Choose some arbitrary, fixed object,
say the number $\pi$, and agree that, if $(E_1 x)\,P$, then $\iota x\,P$ is a name of the
unique $x$ such that $P$ is true, and if $\sim(E_1 x)\,P$, then $\iota x\,P$ is a name of the
number $\pi$. Then for each $x$ and $P$, $\iota x\,P$ is a name of a unique object. Our
axioms for "$\iota$" will be in agreement with such an interpretation.
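
The convention just described is easy to model over a finite domain. In the
Python sketch below, `the(domain, P)` plays the role of $\iota x\,P$: it returns the
unique element satisfying `P` when there is exactly one, and the fixed
object $\pi$ otherwise. The name `the`, the finite `domain` argument (assumed
to list distinct objects), and the use of `math.pi` as the default are all
assumptions of this illustration, not part of the formal system.

```python
import math


def the(domain, P, default=math.pi):
    """Model of the description 'the x such that P(x)' on a finite domain.

    Returns the unique witness if (E1 x) P holds in the domain; otherwise
    returns the arbitrary fixed object chosen by convention (here pi).
    """
    witnesses = [x for x in domain if P(x)]
    if len(witnesses) == 1:
        return witnesses[0]
    return default
```

For instance, `the(range(10), lambda x: x + x == x)` names 0, while
`the(range(-5, 6), lambda x: x * x == 4)` has two candidate witnesses and so
falls back to the default.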

Whitehead and Russell showed that any statement containing $\iota$'s is
equivalent to a much more elaborate statement without $\iota$'s. Hence they
dispensed with statements containing $\iota$'s, using instead the very elaborate
equivalent statements. By this means they did not use $\iota$'s at all. Following
their lead, many logicians do likewise. This has the advantage of saving
one symbol, $\iota$, and four axiom schemes. However, even the simplest
statements of mathematics are transformed into very complicated
circumlocutions by this procedure, and so we do not take advantage of it, despite
its considerable usefulness in many logical studies. In a mathematical
development, such as we are pursuing here, it is far more convenient to use
$\iota$'s, and we shall do so.

Notice that in $\iota x\,F(x)$ it is quite immaterial what variable we use for $x$.
Clearly $\iota y\,F(y)$ would denote the same object. So in $\iota x\,F(x)$, all occurrences
of $x$ are bound. However, if $y$ is a variable different from $x$, then any
occurrences of $y$ which are free in $F(x)$ are also free in $\iota x\,F(x)$.

If $F(x)$ contains no free occurrences of any variable except $x$, then
$\iota x\,F(x)$ contains no free occurrences of any variable, and so must be a
constant. Examples are

$\iota x\,(x \in R.x + x = x),$

$\iota x\,(x \in R:(y).y \in R \supset yx = y),$

$\iota x\,(x \in R:.(\varepsilon):.\varepsilon > 0:\supset:(Ey):y \in R:(z).z > y \supset |(1 + 1/z)^z - x| < \varepsilon),$

where "$x \in R$" denotes "$x$ is a real number." The three formulas just
written denote 0, 1, and $e$, respectively.

If $F(x)$ contains free occurrences of $x$ and $y$ only (where $x$ and $y$ are
distinct variables), then $\iota x\,F(x)$ contains free occurrences of $y$ only, and is a
function of $y$. Thus

$\iota x\,(x \in R.y \in R.x + y = 0),$

$\iota x\,(x \in R.y \in R.xy = 1)$

denote $-y$ and $1/y$, respectively.

Similarly, if $F(x)$ contains free occurrences of $x$, $y$, and $z$, then $\iota x\,F(x)$
is a function of $y$ and $z$.
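
To see concretely how a description with a free variable behaves as a
function, one can again work over a finite stand-in for the real numbers.
In the Python sketch below, `R_SAMPLE` (a small set of fractions), the helper
`the`, and the default value `None` for the case of no unique witness are all
assumptions introduced for illustration.

```python
from fractions import Fraction


def the(domain, P, default=None):
    """Unique element of the finite domain satisfying P, else the default."""
    witnesses = [x for x in domain if P(x)]
    return witnesses[0] if len(witnesses) == 1 else default


# A finite stand-in for R: all n/d with -6 <= n <= 6 and d in {1, 2, 3}.
R_SAMPLE = sorted({Fraction(n, d) for n in range(-6, 7) for d in (1, 2, 3)})


def negative(y):
    """Model of 'the x such that x in R, y in R, x + y = 0'."""
    return the(R_SAMPLE, lambda x: y in R_SAMPLE and x + y == 0)


def reciprocal(y):
    """Model of 'the x such that x in R, y in R, xy = 1'."""
    return the(R_SAMPLE, lambda x: y in R_SAMPLE and x * y == 1)
```

Here `negative` and `reciprocal` depend only on `y`, the one variable left free
in the description; `reciprocal(Fraction(0))` has no witness and returns the
default, mirroring the case where the description fails to name anything.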

All constants and functions of mathematics will arise in this manner.
We now have two ways to bind the occurrences of $x$ in $P$, namely, to
form $(x)\,P$ and $\iota x\,P$. Correspondingly, we now have more ways to produce
confusion of bound variables. Even if our definition of confusion of bound
variables (page 94) is not changed, it now covers more situations. Thus
if $P$ is $x = \iota y\,(y = x)$, and we replace all free occurrences of $x$ by occurrences
of $y$, we get $y = \iota y\,(y = y)$, which is quite a different statement about $y$.
However, this is because we made a replacement which caused confusion
of bound variables, in that the rightmost occurrence of $x$ in $P$ is free, but
when we replace it by an occurrence of $y$, the resulting occurrence of $y$ is
bound.

We extend our conventions about $F(x,y)$ and $F(y,y)$ and $F(x)$ and $F(y)$
(see page 95) to cover the extension in the meaning of "confusion of bound
variables" caused by the introduction of a new means of binding variables.
We must now take account of one essentially new method of causing
confusion of bound variables. This arises from the possibility of replacing
free occurrences of $x$ in $P$ by occurrences of $\iota z\,Q$ rather than by occurrences
of another variable $y$. If we have a statement, for instance, $(Ey)\,y \neq x$,
about a variable $x$, we may wish to make the corresponding statement about
a name $\iota z\,Q$. In general, we would do so by replacing the free occurrences
of $x$ by occurrences of $\iota z\,Q$; thus in the present case we would write
$(Ey)\,y \neq \iota z\,Q$. However, in certain circumstances this is not the
corresponding statement about $\iota z\,Q$. Thus if $Q$ is $z = y$, then $\iota z\,Q$ is $\iota z\,(z = y)$,
which is merely another name for the quantity of which $y$ is a name. So,
whereas $(Ey)\,y \neq x$ is true for every $x$, $(Ey)\,y \neq \iota z\,(z = y)$ is the same as
$(Ey)\,y \neq y$, and is false, and besides is not strictly speaking a statement
about $\iota z\,(z = y)$.
We see that, if we try to substitute $\iota z\,(z = y)$ for $x$, in the above instance,
we get the same sort of confusion of bound variables that we would get if
we try to substitute $y$ for $x$, and for the same reason.

Accordingly, we do have to generalize our definition of confusion of
bound variables. We take the new definition as follows.

Let $A$ denote either a variable or a description. Let $P$ be a statement and
$Q$ be the result of replacing each free occurrence of $x$ (if any) in $P$ by an
occurrence of $A$. Consider each variable $y$ which has free occurrences in $A$.
If some bound occurrence of $y$ in $Q$ is one of the free occurrences of $y$ in an
occurrence of $A$ in $Q$ which is the result of replacing a free occurrence of $x$
in $P$ by an occurrence of $A$, then we say that the replacement causes
confusion. Otherwise, we say that the replacement causes no confusion.

To understand this definition, the reader must realize that occurrences
of a variable may be free in a part of a statement and yet be bound in the
entire statement. Thus in $(Ey)\,y \neq \iota z\,(z = y)$, the rightmost occurrence
of $y$ is free in $\iota z\,(z = y)$, but bound in the entire statement.
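
The definition can be turned into a mechanical test. The Python sketch
below uses a deliberately tiny, hypothetical encoding (not anything from the
text): a variable is a string, a description $\iota v\,P$ is `("iota", v, P)`,
quantified statements are `("all", v, P)` and `("ex", v, P)`, and atomic
statements are `("eq", s, t)` and `("neq", s, t)`. It reports whether replacing
the free occurrences of `x` by `A` would place a free variable of `A` inside the
scope of a binder for that variable.

```python
BINDERS = ("iota", "all", "ex")


def free_vars(expr):
    """Set of variables having a free occurrence in a term or statement."""
    if isinstance(expr, str):
        return {expr}
    tag = expr[0]
    if tag in BINDERS:
        _, var, body = expr
        return free_vars(body) - {var}
    # Atomic statement: union over its two terms.
    return free_vars(expr[1]) | free_vars(expr[2])


def causes_confusion(expr, x, A, bound=frozenset()):
    """True if replacing free occurrences of x in expr by A causes confusion.

    `bound` collects the variables whose binders enclose the current position.
    """
    if isinstance(expr, str):
        # A free occurrence of x: confusion if a free variable of A is bound here.
        return expr == x and bool(free_vars(A) & bound)
    tag = expr[0]
    if tag in BINDERS:
        _, var, body = expr
        if var == x:
            return False  # x is bound below this point; nothing is replaced
        return causes_confusion(body, x, A, bound | {var})
    return any(causes_confusion(t, x, A, bound) for t in expr[1:])
```

With `P = ("ex", "y", ("neq", "y", "x"))` and `A = ("iota", "z", ("eq", "z", "y"))`,
the test reports confusion, matching the example $(Ey)\,y \neq \iota z\,(z = y)$;
replacing `"y"` by a fresh variable such as `"w"` inside `A` removes it.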

As our notations $F(x,y)$ and $F(y,y)$ and $F(x)$ and $F(y)$ refer to replacing
occurrences of a variable by occurrences of a variable, we can continue to
use them with the same meaning as before. However, we wish to
supplement them by a new notation.

In case we refer to two statements as $F(x)$ and $F(\iota y\,Q)$, where $x$ and $y$ are
variables which may or may not be distinct, it shall be assumed that
$F(\iota y\,Q)$ is the result of replacing each free occurrence of $x$ (if any) in $F(x)$
by an occurrence of $\iota y\,Q$, and this replacement causes no confusion. It is
not assumed that there are not occurrences of $\iota y\,Q$ in $F(x)$, nor is it assumed
that there are. It is not assumed that there are free occurrences of $x$ in
$F(x)$. No assumption is made as to the occurrence or nonoccurrence of
variables and descriptions which have not been explicitly mentioned.

This new notation is not strictly analogous to our earlier notations. To
be strictly analogous, we should write $F(x,\iota y\,Q)$ and $F(\iota y\,Q,\iota y\,Q)$.
However, we have preferred a condensed notation to one which preserves a
strict analogy.

We shall now state our axiom schemes for $\iota$, using the conventions which
we have been discussing to avoid explicit mention of necessary assumptions
about the absence of confusion of bound variables.

Axiom scheme 8. Let $x_1, x_2, \ldots, x_n, x, y$ be variables, not necessarily
distinct. Let $F(x)$, $F(\iota y\,Q)$, and $Q$ be statements. Then

$(x_1, x_2, \ldots, x_n):(x)\,F(x).\supset.F(\iota y\,Q)$

is an axiom.

Axiom scheme 9. Let $x_1, x_2, \ldots, x_n, x$ be variables, not necessarily
distinct. Let $P$ and $Q$ be statements. Then

$(x_1, x_2, \ldots, x_n):(x).P \equiv Q.\supset.\iota x\,P = \iota x\,Q$

is an axiom.

Axiom scheme 10. Let $x_1, x_2, \ldots, x_n, x, y$ be variables, not necessarily
distinct. Let $F(x)$ and $F(y)$ be statements. Then

$(x_1, x_2, \ldots, x_n).\iota x\,F(x) = \iota y\,F(y)$

is an axiom.

Axiom scheme 11. Let $x_1, x_2, \ldots, x_n, x$ be variables, not necessarily
distinct. Let $P$ be a statement. Then

$(x_1, x_2, \ldots, x_n):.(E_1 x)\,P:\supset:(x):\iota x\,P = x.\equiv.P$

is an axiom.

The  first  of  these  says  for  descriptions  what  Axiom  scheme  6  says  for 
variables.  Taken  together,  Axiom  scheme  6  and  Axiom  scheme  8  say  that, 
if  A  is  an  object  (that  is,  a  variable  or  a  description),  then  (x)  F(x)  D  F{A). 
Since  we  place  no  restrictions  on  i,y  Q  in  Axiom  scheme  8,  this  means  that 
we  intend  to  treat  Ly  Q  as  an  object  even  in  the  situation  where  '^(Eiy)  Q, 
and  where  ly  Q  has  no  meaning.  The  reader  who  objects  to  this  can  inter- 
pret I?/  Q  as  a  name  for  t  in  all  such  cases. 

Axiom scheme 9 says that, if P and Q are equivalent for all x, then ιx P
and ιx Q are names of the same object. The question of what sense this
makes if ∼(E₁x) P can be raised as for Axiom scheme 8, and answered
similarly.

Axiom scheme 10 enables us to change bound variables to other bound
variables as long as no confusion of bound variables is caused. Because
of this, we can extend the convention of Sec. 11 of Chapter VI to the
following. If we write a formula (x) P or ιx P, then if any variables occur
bound within P they shall be distinct from x unless we specifically indicate
otherwise.

In effect, Axiom scheme 11 says that, if (E₁x) P, then ιx P is the unique x
which makes P true. To see that this is the case, write Axiom scheme 11
in the form

(E₁x) F(x): ⊃ :(x):ιx F(x) = x. ≡ F(x),

which we change to the equivalent form

(E₁x) F(x): ⊃ :(y):ιx F(x) = y. ≡ F(y).

Now assume (E₁x) F(x). Then (y):ιx F(x) = y. ≡ F(y). If there is any
confusion of bound variables in F(ιx F(x)), we can change the bound
variables of F(y) by Thm.VI.6.8 or Axiom scheme 10, so that there is no
confusion of bound variables in F(ιx F(x)). Then by Axiom scheme 8,

ιx F(x) = ιx F(x). ≡ F(ιx F(x)).

So by Thm.VIII.2.1 below,

F(ιx F(x)).

So ιx F(x) is one of the x's which make F(x) true. We now show that it is
the only such x. For suppose F(z). By Axiom scheme 6, ιx F(x) = z.
≡ F(z), whence F(z). ⊃ .ιx F(x) = z, and so z = ιx F(x).
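The reading of Axiom schemes 9 and 11 just given can be tried out on a finite universe. The following Python sketch is only an illustration under assumptions that are not part of the text: the names `the` and `exists_unique`, the finite range standing in for the universe, and the choice of `None` as the arbitrary object named by a description with no unique witness are all hypothetical.

```python
from typing import Callable, Collection, Optional, TypeVar

T = TypeVar("T")


def exists_unique(domain: Collection[T], pred: Callable[[T], bool]) -> bool:
    """Finite reading of (E1 x) P: exactly one x in the domain satisfies pred."""
    return sum(1 for x in domain if pred(x)) == 1


def the(domain: Collection[T], pred: Callable[[T], bool],
        default: Optional[T] = None) -> Optional[T]:
    """Finite reading of the description 'the x such that P'.

    Returns the unique witness if there is exactly one. Otherwise returns
    `default`, an arbitrary stand-in chosen for this sketch (an assumption).
    """
    witnesses = [x for x in domain if pred(x)]
    return witnesses[0] if len(witnesses) == 1 else default


DOMAIN = range(-5, 6)  # hypothetical finite universe


def is_root(x: int) -> bool:
    """F(x): x is a root of 2x - 6 = 0."""
    return 2 * x - 6 == 0


def is_three(x: int) -> bool:
    """A condition equivalent to F(x) for every x in DOMAIN."""
    return x == 3


root = the(DOMAIN, is_root)

# Axiom scheme 11: if (E1 x) F(x), then for all x, (the-x F(x) = x) iff F(x).
scheme_11_holds = (not exists_unique(DOMAIN, is_root)) or all(
    (root == x) == is_root(x) for x in DOMAIN
)

# Axiom scheme 9: if F and G agree for all x, their descriptions are equal.
scheme_9_holds = (not all(is_root(x) == is_three(x) for x in DOMAIN)) or (
    the(DOMAIN, is_root) == the(DOMAIN, is_three)
)

# No unique witness: x*x = 4 has two solutions, so the default is returned.
no_unique = the(DOMAIN, lambda x: x * x == 4)
```

Returning a fixed default when uniqueness fails mirrors the decision to treat every description as an object; any other fixed choice would serve equally well in this sketch.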

The  reader  will  note  that  we  now  have  essentially  three  ways  of  using 
variables  (letters,  that  is).  They  may  occur  free,  or  they  may  occur  bound 
by (x), or they may occur bound by ιx. These three uses are analogous to
familiar  usages  in  everyday  mathematics,  to  wit: 

A free use of x, say in F(x), is analogous to an unknown as in

x² − 4x + 3 = 0.

A use of x bound by (x), say in (x) F(x), is analogous to one kind of
variable as in

x² − 1 = (x + 1)(x − 1).

A use of x bound by ιx, say in ιx F(x), is analogous to another kind of
variable as in

∫₀ x² dx.

It  will  turn  out  that  these  three  uses  of  variables  (letters)  suffice  for  all 
mathematical  statements.  Indeed,  as  shown  by  Whitehead  and  Russell, 
one can dispense with ιx, but only at the expense of using very elaborate
circumlocutions. 
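The three uses of a letter have close analogues in programming, which the following Python sketch tries to show. It is an informal analogy only: the finite ranges are assumptions made so that the checks terminate, and a finite sum stands in for the integral as an example of a dummy variable.

```python
def residual(x: int) -> int:
    """x occurs free: the expression has a value only once x is supplied."""
    return x**2 - 4 * x + 3


# Treating x as an unknown: search for the values that make the equation true.
roots = [x for x in range(-10, 11) if residual(x) == 0]

# x bound by a universal quantifier: the identity is asserted for every x
# in the (assumed, finite) sample range.
identity_holds = all(x**2 - 1 == (x + 1) * (x - 1) for x in range(-10, 11))

# x bound as a dummy variable: renaming it to t does not change the value,
# just as renaming the variable of integration does not change an integral.
sum_with_x = sum(x**2 for x in range(0, 4))
sum_with_t = sum(t**2 for t in range(0, 4))
```

In the last two lines neither `x` nor `t` is visible outside the expression that binds it, which is the programming counterpart of a bound occurrence.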

EXERCISES 

VIII.1.1. Let "x ∈ Nn" denote "x is a nonnegative integer" and let
+ and × have their customary numerical significance. Using "x ∈ Nn",
+, ×, and logical symbols write formulas for:

(a) The greatest prime factor of n.

(b) The least common multiple of m and n.

(c) The greatest integer less than or equal to √n, when n is a positive
integer.

(d) The integer quotient got by dividing m by n when m and n are positive
integers.

(e) The remainder left after dividing m by n when m and n are positive
integers.

(f) n − 1.

VIII.1.2. Let "x ∈ R" denote "x is a real number" and let + and
× have their customary numerical significance. Using "x ∈ R," +, ×,
and logical symbols write formulas for:

(a)  I  • 

(b)  +  v^. 

(c) | x − y |.

2. Definition by Cases. We first establish a few properties of ι.

Theorem VIII.2.1. ⊢ ιx P = ιx P.

Proof. By Axiom scheme 8, ⊢ (x) x = x. ⊃ .ιx P = ιx P. So our
theorem follows by Axiom scheme 7B.

Theorem VIII.2.2. If F(x) is a statement such that there is no confusion
of bound variables in F(ιx F(x)), then ⊢ (E₁x) F(x) ⊃ F(ιx F(x)).

Proof. By Axiom scheme 8,

⊢ (x):ιx F(x) = x. ≡ F(x): ⊃ :ιx F(x) = ιx F(x). ≡ F(ιx F(x)).

So, by Thm.VIII.2.1 and the statement calculus,

⊢ (x):ιx F(x) = x. ≡ F(x): ⊃ .F(ιx F(x)).

Our theorem now follows by Axiom scheme 11.

Theorem VIII.2.3. ⊢ (y).ιx (x = y) = y.

Note that our conventions require that x and y be distinct variables.

Proof. Choose y a variable distinct from x and take F(x) to be x = y.
Then by Thm.VII.2.2, ⊢ (E₁x) F(x), and so by Thm.VIII.2.2, ⊢ F(ιx F(x)).
That is, ⊢ ιx (x = y) = y.

The question now arises whether we can prove ⊢ y = ιx (x = y). One
is tempted to try to use Axiom scheme 8 to replace x by ιx (x = y) in
⊢ (x,y).x = y ⊃ y = x, but unfortunately the prohibition against confusion
of bound variables in Axiom scheme 8 prevents this. To take care of this
sort of difficulty, we prove the following theorem, in which it is permitted
that there be free occurrences of y in ιx P.

Theorem VIII.2.4.

⊢ (y):ιx P = y. ≡ .y = ιx P.

Proof. Choose a z which does not occur in ιx P. Then by Ex.VII.1.1,

⊢ (x,z):x = z. ≡ .z = x.

So by Axiom scheme 8,

⊢ (z):ιx P = z. ≡ .z = ιx P.

Then by Axiom scheme 6,

⊢ ιx P = y. ≡ .y = ιx P.

In a similar manner we prove:

Theorem VIII.2.5.

I. ⊢ (y,z):ιx P = y.y = z. ⊃ .ιx P = z.

II. ⊢ (x,z):x = ιy Q.ιy Q = z. ⊃ .x = z.

III. ⊢ (z):ιx P = ιy Q.ιy Q = z. ⊃ .ιx P = z.

IV. ⊢ (y):ιx P = y.y = ιz R. ⊃ .ιx P = ιz R.

If we are going to be able to substitute ιx (x = y) for y or vice versa, we
need a similar generalized form of Axiom scheme 7A, which we now prove.

Theorem VIII.2.6. Let F(x,y) be a statement, and let F(ιx P,y) and
F(y,y) be the results of replacing all free occurrences of x in F(x,y) by
occurrences of ιx P and y, respectively, and suppose these replacements
cause no confusion of bound variables. Then

⊢ (y):ιx P = y. ⊃ .F(ιx P,y) ≡ F(y,y).

Proof. Let z be a variable which does not occur in F(x,y) or ιx P, and
let F(x,z) and F(z,z) denote the results of replacing all free occurrences of y
in F(x,y) and F(y,y) by occurrences of z. Also let F(ιx P,z) be the result
of replacing all free occurrences of x in F(x,z) by occurrences of ιx P.
Clearly there is no confusion of bound variables in F(ιx P,z) since there is
none in F(ιx P,y). By Axiom scheme 7A,

⊢ (x,z):x = z. ⊃ .F(x,z) ⊃ F(z,z).

So by Axiom scheme 8

⊢ (z):ιx P = z. ⊃ .F(ιx P,z) ⊃ F(z,z).

So by Axiom scheme 6

⊢ ιx P = y. ⊃ .F(ιx P,y) ⊃ F(y,y).

By Thm.VII.1.1 and Axiom scheme 7A, we have

⊢ (x,z):x = z. ⊃ .F(z,z) ⊃ F(x,z).

So, by Axiom scheme 8 and Axiom scheme 6,

⊢ ιx P = y. ⊃ .F(y,y) ⊃ F(ιx P,y).

Then

⊢ ιx P = y. ⊃ .F(ιx P,y) ≡ F(y,y)

and our theorem follows.

**Theorem VIII.2.7. ⊢ F(ιy P) ⊃ (Ex) F(x).

Proof. Analogous to the proof of Thm.VI.7.3.

The reader should note carefully our conventions about the use of
F(ιy P) and F(x), which are such that an instance of this theorem would be

⊢ ιy P = ιy P. ⊃ .(Ex).x = ιy P

if there are no free occurrences of x in P.

We are now about ready to prove the theorem which permits us to make
a definition by cases. Such definitions are very common in mathematics,
and we cite two familiar cases:

         0    if x is rational
f(x) =
         1    if x is irrational

        −1    if x < 0
f(x) =   0    if x = 0
        +1    if x > 0

Note that these definitions do not define f(x) for each x; both leave
f(√−1) undefined. One could always add an additional clause defining
f(x) in all circumstances not already covered, but it is useful not to have to
do so. The typical circumstance for a definition by cases is that we have
some mutually exclusive conditions, for instance, "x < 0", "x = 0",
"x > 0", and we wish to define f(x) for each x covered by one of our
conditions, and with different definitions according to which condition x
satisfies.

A condition on x will in general be a statement involving x. The
assertion that two conditions Pᵢ and Pⱼ be mutually exclusive is that ∼(PᵢPⱼ).
The assertion that several conditions, P₁, P₂, ..., Pₙ be mutually exclusive
is the logical product of all statements ∼(PᵢPⱼ) with 1 ≤ i < j ≤ n. So
the first sentence of each of the next two theorems merely defines Q as the
assertion that the conditions P are mutually exclusive.

Theorem VIII.2.8. Let P₁, P₂, ..., Pₙ be statements and let Q be the
logical product of all statements ∼(PᵢPⱼ) with 1 ≤ i < j ≤ n. Let y be a
variable. For each i, 1 ≤ i ≤ n, let Aᵢ be a variable different from y or a
description not containing free occurrences of y. Then for 1 ≤ k ≤ n:

I. ⊢ (y).QPₖ: ⊃ :(E₁y):y = A₁.P₁.v.y = A₂.P₂.v. ··· .v.y = Aₙ.Pₙ.

II. ⊢ (y).QPₖ: ⊃ :ιy (y = A₁.P₁.v.y = A₂.P₂.v. ··· .v.y = Aₙ.Pₙ) = Aₖ.

Proof. By truth values

⊢ QPₖ: ⊃ :R₁P₁vR₂P₂v ··· vRₙPₙ. ≡ .Rₖ.

So, if we take Rᵢ to be y = Aᵢ and take F(y) to be

y = A₁.P₁.v.y = A₂.P₂.v. ··· .v.y = Aₙ.Pₙ,

we have

⊢ QPₖ: ⊃ :F(y). ≡ .y = Aₖ.

So by rule G and Axiom scheme 4, we get

⊢ (y).QPₖ: ⊃ :(y):F(y). ≡ .y = Aₖ.

Now by Ex.VII.2.3,

⊢ (y):F(y). ≡ .y = Aₖ.: ⊃ :.(E₁y) F(y). ≡ .(E₁y) y = Aₖ.

If Aₖ is a variable, then we get ⊢ (E₁y) y = Aₖ from Thm.VII.2.2 by Axiom
scheme 6. If Aₖ is a description, then we get ⊢ (E₁y) y = Aₖ from
Thm.VII.2.2 by Axiom scheme 8. In either case, we get ⊢ (E₁y) y = Aₖ, and so
infer

⊢ (y):F(y). ≡ .y = Aₖ.: ⊃ :.(E₁y) F(y).

So

⊢ (y).QPₖ: ⊃ :(E₁y) F(y),

which is Part I of our theorem. By Axiom scheme 9,

⊢ (y):F(y). ≡ .y = Aₖ.: ⊃ :.ιy F(y) = ιy (y = Aₖ).

However, by Thm.VIII.2.3 and whichever of Axiom schemes 6 or 8 is
appropriate, we have ⊢ ιy (y = Aₖ) = Aₖ. So we can combine the various
results given above and get

⊢ (y).QPₖ: ⊃ :ιy F(y) = Aₖ,

which is Part II of our theorem.

The  theorem  which  we  have  just  proved  is  a  little  too  general  for  most 
purposes.  In  general,  if  we  are  making  a  definition  by  cases,  the  P's  will 
be  conditions  on  x,  and  so  must  contain  free  occurrences  of  x,  but  we  shall 
be  able  to  take  y  to  be  a  variable  which  does  not  occur  in  the  P's.  This 
gives  us  a  special  case  of  the  above  theorem,  which  special  case  is  the  one 
in  common  use. 

**Theorem VIII.2.9. Let P₁, P₂, ..., Pₙ be statements and let Q be the
logical product of all statements ∼(PᵢPⱼ) with 1 ≤ i < j ≤ n. Let y be a
variable not occurring in any of the P's. For each i, 1 ≤ i ≤ n, let Aᵢ be a
variable different from y or a description not containing free occurrences
of y. Then:

I. ⊢ Q(P₁vP₂v ··· vPₙ): ⊃ :(E₁y):y = A₁.P₁.v.y = A₂.P₂.v. ··· .v.
y = Aₙ.Pₙ.

Also, for 1 ≤ k ≤ n,

II. ⊢ QPₖ: ⊃ :ιy (y = A₁.P₁.v.y = A₂.P₂.v. ··· .v.y = Aₙ.Pₙ) = Aₖ.

Proof. Let F(y) be as in the preceding proof. Then by the preceding
theorem,

⊢ (y).QPₖ: ⊃ :(E₁y) F(y),
⊢ (y).QPₖ: ⊃ :ιy F(y) = Aₖ.

Since there are no occurrences of y in any Pᵢ, there are no occurrences
of y in QPₖ. So by Thm.VI.6.6, Part I,

⊢ (y).QPₖ: ≡ :QPₖ.

Hence

⊢ QPₖ: ⊃ :(E₁y) F(y),

⊢ QPₖ: ⊃ :ιy F(y) = Aₖ.

The second of these is Part II of our theorem. From the first, we get
for k = 1, 2, ..., n

⊢ Pₖ ⊃ :Q. ⊃ .(E₁y) F(y)

by Thm.VI.6.1, Part LXIII. Then by repeated applications of Ex.IV.4.6,
Part (d), we get

⊢ P₁vP₂v ··· vPₙ: ⊃ :Q. ⊃ .(E₁y) F(y).

So

⊢ Q(P₁vP₂v ··· vPₙ). ⊃ .(E₁y) F(y).

Let us now see how we can use this theorem to define the functions
discussed earlier.

We can take P₁ to be "x is rational" and P₂ to be "x is irrational". Then
Q is ∼(P₁P₂), and we have ⊢ Q. Also we have ⊢ x is real. ⊃ .P₁vP₂. So
by the theorem just proved

⊢ x is real. ⊃ .(E₁y):y = 0.x is rational.v.y = 1.x is irrational,

⊢ x is rational. ⊃ .ιy (y = 0.x is rational.v.y = 1.x is irrational) = 0,

⊢ x is irrational. ⊃ .ιy (y = 0.x is rational.v.y = 1.x is irrational) = 1.

So if we take f(x) to be ιy (y = 0.x is rational.v.y = 1.x is irrational), we
have

⊢ x is rational. ⊃ .f(x) = 0,

⊢ x is irrational. ⊃ .f(x) = 1.

Similarly, to define the other function mentioned, we take P₁, P₂, and P₃
to be "x < 0", "x = 0", and "x > 0". In this case Q is ∼(P₁P₂)∼(P₁P₃)
∼(P₂P₃). By theorems of mathematics

⊢ x is real. ⊃ .P₁vP₂vP₃.

So by Part I of our theorem

⊢ x is real: ⊃ :(E₁y):y = −1.x < 0.v.y = 0.x = 0.v.y = 1.x > 0.

Also, if we take f(x) to be ιy (y = −1.x < 0.v.y = 0.x = 0.v.y = 1.
x > 0), then by Part II of our theorem we have

⊢ x < 0. ⊃ .f(x) = −1,

⊢ x = 0. ⊃ .f(x) = 0,

⊢ x > 0. ⊃ .f(x) = 1.

Notice that it is permitted that the Aᵢ contain free occurrences of x.
Thus, let P₁ be x < 0, P₂ be x > 0, A₁ be −x, A₂ be x. Then if we define
f(x) to be ιy (y = −x.x < 0.v.y = x.x > 0), we have

⊢ x < 0. ⊃ .f(x) = −x,

⊢ x > 0. ⊃ .f(x) = x.
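The construction of f(x) as a description over a disjunction of cases can be imitated on a finite set of candidate values. The Python sketch below is a hedged illustration, not the formal system itself: the helper names `define_by_cases` and `mutually_exclusive`, the finite candidate sets for y, and the use of `None` where no unique y exists are all assumptions made for the example.

```python
from typing import Callable, Collection, Optional, Sequence, Tuple

Condition = Callable[[int], bool]  # P_i, a condition on x
Value = Callable[[int], int]       # A_i, which may itself depend on x
Case = Tuple[Condition, Value]


def mutually_exclusive(cases: Sequence[Case], x: int) -> bool:
    """Q at the point x: no two of the conditions hold together."""
    return sum(1 for condition, _ in cases if condition(x)) <= 1


def define_by_cases(cases: Sequence[Case],
                    candidates: Collection[int]) -> Callable[[int], Optional[int]]:
    """Build f(x) as the unique y with (y = A_1 and P_1) or ... or (y = A_n and P_n).

    y ranges over the finite collection `candidates` (an assumption of this
    sketch). When no unique y exists, f(x) is None.
    """
    def f(x: int) -> Optional[int]:
        witnesses = [
            y for y in candidates
            if any(condition(x) and y == value(x) for condition, value in cases)
        ]
        return witnesses[0] if len(witnesses) == 1 else None
    return f


# The three-case function: -1, 0, 1 according to x < 0, x = 0, x > 0.
sign_cases = [
    (lambda x: x < 0, lambda x: -1),
    (lambda x: x == 0, lambda x: 0),
    (lambda x: x > 0, lambda x: 1),
]
sign = define_by_cases(sign_cases, (-1, 0, 1))

# Values A_i that contain x: -x when x < 0, and x when x > 0.
magnitude_cases = [
    (lambda x: x < 0, lambda x: -x),
    (lambda x: x > 0, lambda x: x),
]
magnitude = define_by_cases(magnitude_cases, range(0, 11))
```

Here `magnitude(0)` is `None`, since neither condition covers x = 0; this matches the remark that a definition by cases need not define f(x) for every x.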

For  another  example  of  definition  by  cases,  see  Titchmarsh,  1939,  Sec. 
1.63,  page  31. 

We now raise the question of the interpretation of ια F(α) when α is
restricted to the range K(α). The obvious interpretation would be
ιx (K(x).F(x)). However, in case ∼(E₁x).K(x).F(x), this definition would
not permit us to prove ⊢ K(ια F(α)). For this reason, we adopt a definition
by cases. First of all we choose some fixed object A satisfying our restriction
K(x), so that we have ⊢ K(A). Now we define ια F(α) to be ιx (K(x).F(x))
in case (E₁x).K(x).F(x), and to be A in case ∼(E₁x).K(x).F(x). That is,
we define ια F(α) to be

ιy (y = ιx (K(x).F(x)):(E₁x).K(x).F(x).:v:.y = A.∼(E₁x).K(x).F(x)),

being careful to choose y a variable that does not occur in A or K(x)F(x).

Theorem VIII.2.10. If α is subject to the restriction K(α) and A is the
fixed object chosen for use in defining ια F(α), then:

I. ⊢ (E₁α) F(α). ⊃ .ια F(α) = ιx (K(x)F(x)).

II. ⊢ ∼(E₁α) F(α). ⊃ .ια F(α) = A.

Proof. This follows from Thm.VIII.2.9, Part II, and the definition of
ια F(α), since by Thm.VII.2.3,

⊢ (E₁α) F(α): ≡ :(E₁x).K(x).F(x).

We now wish to prove ⊢ K(ια F(α)). However, we can prove this only
when there is no confusion of bound variables in K(ια F(α)), and indeed it
does not mean what we intend in case there is confusion of bound variables.
We shall prove instead ⊢ (Ez).ια F(α) = z.K(z), where z does not occur in
ια F(α). Then, if there is no confusion of bound variables in K(ια F(α)),
we would have

⊢ K(ια F(α)): ≡ :(Ez).ια F(α) = z.K(z)

by Thm.VII.1.5, Part I, and Axiom scheme 8. So, since we can always
choose z so as to avoid confusion of bound variables, (Ez).ια F(α) = z.K(z)
serves as a substitute for K(ια F(α)).

Theorem VIII.2.11. If α is subject to the restriction K(α) and z does
not occur in ια F(α), then ⊢ (Ez).ια F(α) = z.K(z).

Proof. By ⊢ ια F(α) = ια F(α), we get ⊢ (Ez).ια F(α) = z by
Thm.VIII.2.7. So by rule C, ια F(α) = z. We now proceed by cases (see 6 in
Sec. 5 of Chapter II).

Case 1. (E₁α) F(α). Then ια F(α) = ιx (K(x)F(x)) by Thm.VIII.2.10,
Part I. So ιx (K(x)F(x)) = z. Now from (E₁α) F(α), we get (E₁x) K(x)F(x)
by Thm.VII.2.3, and so get (x):ιx (K(x)F(x)) = x. ≡ .K(x)F(x) by
Axiom scheme 11. So ιx (K(x)F(x)) = z. ≡ .K(z).F(z). So K(z), and
hence ια F(α) = z.K(z), and so (Ez).ια F(α) = z.K(z).

Case 2. ∼(E₁α) F(α). Then ια F(α) = A by Thm.VIII.2.10, Part II.
So A = z. However, we chose A so that ⊢ K(A), and so K(z). Then
(Ez).ια F(α) = z.K(z).

Corollary. If α is subject to the restriction K(α) and there is no
confusion of bound variables in K(ια F(α)), then ⊢ K(ια F(α)).
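The role of the fixed object A can be seen in a small finite model. The following Python sketch is an assumption-laden illustration: the function name `restricted_the`, the finite domain, the restriction "x is nonnegative", and the choice A = 0 are all hypothetical. It checks the content of the Corollary, namely that the restricted description always lands inside the range K.

```python
from typing import Callable, Collection, List, TypeVar

T = TypeVar("T")


def restricted_the(domain: Collection[T], in_range: Callable[[T], bool],
                   pred: Callable[[T], bool], fallback: T) -> T:
    """Finite reading of a description whose variable is restricted to K.

    Returns the unique x with K(x) and F(x) when there is exactly one;
    otherwise returns the fixed object A (`fallback`), which must satisfy K.
    """
    if not in_range(fallback):
        raise ValueError("the fixed object A must satisfy the restriction K")
    witnesses = [x for x in domain if in_range(x) and pred(x)]
    return witnesses[0] if len(witnesses) == 1 else fallback


DOMAIN = range(-10, 11)  # hypothetical finite universe
A = 0                    # hypothetical fixed object with K(A)


def is_nonnegative(x: int) -> bool:
    """K(x): the restriction on the variable."""
    return x >= 0


# Unique within the range K, although x*x = 9 has two solutions overall.
unique_in_range = restricted_the(DOMAIN, is_nonnegative, lambda x: x * x == 9, A)

# No witness at all, so the fixed object A is returned.
no_witness = restricted_the(DOMAIN, is_nonnegative, lambda x: x * x == 2, A)

# The only witness lies outside the range K, so A is returned again.
outside_range = restricted_the(DOMAIN, is_nonnegative, lambda x: x + 1 == 0, A)

# Several witnesses within the range K, so A is returned.
several = restricted_the(DOMAIN, is_nonnegative, lambda x: x < 3, A)

results: List[int] = [unique_in_range, no_witness, outside_range, several]
always_in_range = all(is_nonnegative(r) for r in results)
```

Had the fallback been an object outside the range, the last check could fail, which is the reason for insisting on ⊢ K(A) when A is chosen.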

Theorem VIII.2.12. If α and β are subject to the restriction K(α), then:

I. ⊢ (α) F(α). ⊃ .F(ιβ G(β)).

II. ⊢ (α).F(α) ≡ G(α): ⊃ :ια F(α) = ια G(α).

III. ⊢ ια F(α) = ιβ F(β).

IV. If there are no free occurrences of x in ια F(α), then ⊢ (E₁α) F(α): ⊃ :
(x):ια F(α) = x. ≡ .K(x).F(x).

V. If there are no free occurrences of β in ια F(α), then ⊢ (E₁α) F(α): ⊃ :
(β):ια F(α) = β. ≡ .F(β).

Proof of I. By Thm.VIII.2.11 and rule C, ιβ G(β) = z.K(z). By Axiom
scheme 6, ⊢ (α) F(α): ⊃ :K(z) ⊃ F(z). So (α) F(α). ⊃ .F(z), and from this
by z = ιβ G(β) we get (α) F(α). ⊃ .F(ιβ G(β)).

Proof of II. (α).F(α) ≡ G(α) is (x):K(x). ⊃ .F(x) ≡ G(x), from which
we quickly get (x):K(x).F(x). ≡ .K(x).G(x).

Case 1. (E₁α) F(α). That is, (E₁x).K(x).F(x). Then by Ex.VII.2.3,
(E₁x).K(x).G(x). So by Thm.VIII.2.10, Part I, ια F(α) = ιx (K(x).F(x))
and ια G(α) = ιx (K(x).G(x)). However, by Axiom scheme 9,
ιx (K(x).F(x)) = ιx (K(x).G(x)). So ια F(α) = ια G(α).

Case 2. ∼(E₁α) F(α). Then ∼(E₁α) G(α) by Ex.VII.2.3. So by
Thm.VIII.2.10, Part II, ια F(α) = A and ια G(α) = A. So ια F(α) =
ια G(α).

Proof of III. There is really nothing to prove, since in general the
variable α does not actually occur in the expanded form of ια F(α). However,
to justify using for ια F(α) a formula in which some other variable occurs
bound, we need to know that (aside from avoiding confusion of bound
variables) the choice of this other variable is immaterial. On this point
we are assured by Axiom scheme 10.

Proof of IV. Assume (E₁α).F(α). Then by Thm.VIII.2.10, ια F(α) =
ιx (K(x).F(x)). However, by Axiom scheme 11 we have (x):ιx (K(x).F(x))
= x. ≡ .K(x).F(x). So we get (x):ια F(α) = x. ≡ .K(x)F(x).

Proof of V. Assume (E₁α).F(α). Then by Part IV, (x):ια F(α) = x. ≡ .
K(x)F(x). So (x):.K(x): ⊃ :ια F(α) = x. ≡ .F(x). That is, (β):ια F(α) =
β. ≡ .F(β).

Note that in I, III, and V it is necessary that α and β be subject to the
same restriction. In the case of I and V, this is clear from the proof, and
in the case of III it turns out that ια F(α) and ιβ F(β) are abbreviations of
quite different statements if α and β are subject to different restrictions.

EXERCISES

VIII.2.1. Prove:

(a) ⊢ ιx P = ιy Q. ≡ .ιy Q = ιx P.

(b) ⊢ ιx P = ιy Q.ιy Q = ιz R. ⊃ .ιx P = ιz R.

(c) ⊢ F(ιx P). ≡ .(Ex).x = ιx P.F(x).

(d) ⊢ F(ιx P): ≡ :(x):x = ιx P. ⊃ .F(x).

VIII.2.2. Prove that, if y does not occur free in ιx P, then

⊢ F(ιx P). ≡ .(Ey).y = ιx P.F(y),

and explain where the proof would break down if there were free
occurrences of y in ιx P.

VIII.2.3. Taking F(y) to be y = y, find a mathematical statement P
such that (E₁x) P is true and F(ιx P). ≡ .(Ey).y = ιx P.F(y) is false.

VIII.2.4. If F(x), F(ιx P), and F(ιy Q) are interpreted by our
conventions, prove ⊢ ιx P = ιy Q. ⊃ .F(ιx P) ≡ F(ιy Q).

VIII.2.5. Prove

⊢ (y):.(x).y = x ≡ P. ⊃ .y = ιx P.

VIII.2.6. Prove that, if there are no free occurrences of y in P, then

⊢ (E₁x) P: ⊃ :F(ιx P). ≡ .(Ey).F(y).(x).y = x ≡ P.

VIII.2.7. In the preceding exercise, indicate why none of our
conventions ensures that there are no free occurrences of y in P, and show where
the proof breaks down if we permit P to contain some free occurrences of y.

VIII.2.8. Prove that, if α and β are subject to the restriction K(α), then
⊢ (β).ια (α = β) = β. State where the proof would break down if α and β
were not subject to the same restriction.

VIII.2.9. Prove that, if α is subject to the restriction K(α), then

⊢ F(ια G(α)). ⊃ .(Ex).K(x).F(x).

VIII.2.10. Prove that in Thm.VIII.2.9, if we replace y by a variable
α subject to the restriction K(α), the conclusions still hold if we insert the
additional hypothesis K(A₁).K(A₂). ··· .K(Aₙ) into each conclusion.

VIII.2.11. Let "x ∈ R" denote "x is a real number" and let < have its
customary numerical significance. Using "x ∈ R", <, and logical symbols
write a formula for "the least of x, y, and z."

VIII.2.12. Supply the conditions on free and bound variables needed to
make the following statement valid, and prove the resulting valid
statement.

⊢ (x,y).F₁(x,y) ≡ F₂(x,y).: ⊃ :.(z).G₁(z) ≡ G₂(z): ⊃ :ιy G₁(ιx F₁(x,y)) =
ιy G₂(ιx F₂(x,y)).

3. Uses of Descriptions in Everyday Mathematics. The uses of
descriptions in everyday mathematics are so numerous that we shall just
point to a few instances to indicate this widespread use. Some notion of
the extensive use of descriptions can be got from the fact that "the" is the
most common word in the English language and that most uses of "the"
occur in descriptions. When we speak of "the line through two points,"
"the set of all primes," "the derivative of f(x)," etc., we are using
descriptions. All particular constants or functions of mathematics are given by
descriptions.

In proving theorems about descriptions, most of the results do not follow
from any special axioms about descriptions, but from the description itself.
In all practical cases where we are dealing with ιx F(x), we can prove
(E₁x) F(x), and hence infer F(ιx F(x)) (unless there is confusion of bound
variables, which there never is in any practical case), and practically all
theorems about ιx F(x) are derived from the result F(ιx F(x)). We
occasionally use Axiom scheme 9 to prove important equalities between different
descriptions, and indeed some of the most useful equalities in mathematics
come from Axiom scheme 9. A typical use of Axiom scheme 9 occurred in
the proof of Thm.VIII.2.8.

We have encountered descriptions many times already in various of our
examples. Since we had not yet explained the theory of descriptions,
we tried to find examples with as few descriptions as possible, and when we
could not avoid descriptions, we treated them as part of the mathematics.
Let us now go back and clear up the use of descriptions in one of our
examples. On pages 136 to 139 we formalized Bôcher's proof that the sum
of two continuous functions is continuous. The first use of a description
occurs in the notation f(x). We cannot now give any further details, but
the entire question of the treatment of functions will be taken up later and
will be seen to depend greatly on the use of descriptions. Our next
description is the quantity ε/2. This might be defined as ιw (w > 0.w + w = ε) or
in various other ways according to how many of the symbols +, ×, 2, etc.,
are then available. However it is defined, it is a description, and our use
of Axiom scheme 6 on page 137 must really be a use of Axiom scheme 8.
On page 138, our definition of δ as the smaller of δ₁ and δ₂ is a definition
by cases. That is, we take δ to be

ιw (w = δ₁.δ₁ ≤ δ₂.v.w = δ₂.δ₁ > δ₂).

Then by Thm.VIII.2.9, Part II, we get

⊢ δ₁ ≤ δ₂. ⊃ .δ = δ₁.
⊢ δ₁ > δ₂. ⊃ .δ = δ₂.

From these, we can infer δ > 0, δ ≤ δ₁, δ ≤ δ₂ as follows. Since δ₁ > 0
and δ₂ > 0, we have δ₁ ≤ δ₂vδ₁ > δ₂ by a theorem of mathematics. We
now give a proof by cases (see 5 of Sec. 5 of Chapter II).

Case 1. δ₁ ≤ δ₂. Then δ = δ₁ so that δ ≤ δ₁ and δ ≤ δ₂ and δ > 0
(since δ₁ > 0).

Case 2. δ₁ > δ₂. Then δ = δ₂ so that δ ≤ δ₂ and δ < δ₁ (giving
δ ≤ δ₁) and δ > 0.

Since δ is a description, our reference to Thm.VI.7.3 on page 139 should
actually be a reference to Thm.VIII.2.7.
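The definition of δ as the smaller of δ₁ and δ₂ can be checked numerically in the same spirit. This Python sketch is only an illustration: the function name `smaller`, the restriction of the candidate values of w to δ₁ and δ₂ themselves, and the sample values are assumptions made for the example.

```python
from itertools import product


def smaller(d1: float, d2: float) -> float:
    """The unique w with (w = d1 and d1 <= d2) or (w = d2 and d1 > d2).

    w ranges over the two candidates d1 and d2 (an assumption of this
    sketch); a set is used so that d1 = d2 still yields a single witness.
    """
    witnesses = {
        w for w in (d1, d2)
        if (w == d1 and d1 <= d2) or (w == d2 and d1 > d2)
    }
    if len(witnesses) != 1:
        raise ValueError("no unique witness for the description")
    return witnesses.pop()


SAMPLES = [0.5, 1, 2, 3.25]  # hypothetical positive values for d1 and d2

# The three facts inferred by cases: delta > 0, delta <= d1, delta <= d2.
facts_hold = all(
    0 < smaller(d1, d2) <= d1 and smaller(d1, d2) <= d2
    for d1, d2 in product(SAMPLES, repeat=2)
)

# The description agrees with the usual minimum on every sample pair.
agrees_with_min = all(
    smaller(d1, d2) == min(d1, d2) for d1, d2 in product(SAMPLES, repeat=2)
)
```

The two conditions `d1 <= d2` and `d1 > d2` are mutually exclusive and together cover every pair of reals, so both parts of the theorem on definition by cases apply.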


CHAPTER  IX 
CLASS  MEMBERSHIP 

1.  The  Notion  of  a  Class.  The  notion  of  a  class,  or  set,  or  aggregate,  or 
ensemble  is  familiar  and  widely  used.  The  theory  of  point  sets  has  received 
especial attention, but many other types of classes are of common
occurrence, for instance, the class of differentiable functions, the set of prime
numbers,  the  cosets  of  a  subgroup,  etc.  Some  sets  are  ordered,  for  instance, 
the  Fourier  coefficients  of  a  periodic  integrable  function.  We  shall  later 
consider  ordered  sets  but  for  the  present  shall  confine  our  attention  to 
unordered  sets.  For  us,  the  word  "class"  or  "set"  will  always  mean  an 
unordered  class  or  set,  unless  explicitly  stated  otherwise.  We  make  no 
distinction  at  all  between  "class"  and  "set." 

We shall use "∈" for the relation of membership between an object and a
class, so that "x ∈ α" shall denote that the object denoted by "x" is in the
class denoted by "α." That is, x is a member of α. Thus, if "DF" denotes
the class of differentiable functions, "f ∈ DF" shall denote that f is a
differentiable function. Likewise, if "PN" denotes the set of prime numbers,
then "3 ∈ PN" shall denote that 3 is a prime number.

The notation x₁, x₂, ..., xₙ ∈ α shall denote (x₁ ∈ α)(x₂ ∈ α) ··· (xₙ ∈ α).
When necessary to avoid ambiguity, we shall enclose x ∈ α in parentheses.
If no parentheses are used, x ∈ α is to have the minimum possible scope.
We shall commonly write ∼(x ∈ α) as ∼x ∈ α. This is possible because x
will never be a statement, and as ∼ can be applied only to statements, it
must be applied to the complete statement x ∈ α, and not to the portion x,
which is not a statement. Many logicians write x ∼∈ α or x ∉ α for ∼(x ∈ α).

Intuitively  there  seems  to  be  a  vast  distinction  between  the  notions  of 
finite  and  infinite  class.  In  theory,  one  could  always  collect  together  the 
members  of  a  finite  class  and  thus  have  explicitly  before  one  a  totality 
comprising  the  class  in  question.  No  such  procedure  is  even  theoretically 
possible  with  an  infinite  class.  Thus,  in  dealing  with  an  infinite  class,  one 
is  laboring  under  a  handicap  of  intangibility  which  is  supposedly  not  present 
when  one  is  dealing  with  a  finite  class.  Of  course,  many  finite  classes  arising 
in  scientific  problems  are  quite  as  intangible  as  any  infinite  class.  The 
classes  of  stars  in  this  galaxy  or  atoms  in  this  sheet  of  paper  or  animals  on 
this  planet  are  as  incapable  of  comprehension  as  a  totality  as  is  the  class  of 
prime  numbers.    So  the  notion  of  a  class  as  an  assembled  totality  will  not 
be useful for us. Although one may think of the simple aggregates
encountered in daily life (e.g., family, wardrobe, etc.) as assembled totalities, this
will not do for the infinite or very numerous finite classes needed in
mathematical discourse.

How,  then,  shall  we  treat  these  classes?  Perhaps  the  first  conscious 
effort  to  deal  with  infinite  classes  came  with  the  study  of  geometrical  loci. 
A  geometrical  locus  is  the  set  of  points  satisfying  some  condition.  Though 
it  is  past  the  power  of  the  human  mind  to  conceive  of  the  totality  of  all 
points  in  the  usual  geometrical  locus,  the  defining  condition  can  be  easily 
grasped.  Consequently,  in  trying  to  treat  geometrical  loci,  it  was  found 
expedient  to  operate  with  the  defining  condition  rather  than  to  attempt  to 
manipulate  the  intangible  infinite  totality  which  actually  constitutes  the 
locus.  This  worked  so  well  that  it  has  been  extended  to  all  infinite  classes, 
so  that  now  the  standard  procedure  for  treating  an  infinite  class  is  to  find  a 
defining  condition  for  it  and  to  deal  with  the  defining  condition. 

A condition on x is just a statement containing free occurrences of x.
Thus the statement, "x is equidistant from the points A and B" is a
condition on x. This condition determines a class of points, namely, the
perpendicular bisector of the line joining A and B. When we say that a
condition determines a class, we mean that exactly those objects are
members of the class which satisfy the condition. That is, if F(x) is a condition
and α is a class, we say that F(x) determines α if and only if (x).x ∈ α ≡ F(x).

This is familiar from the treatment of geometrical loci. To show that the
perpendicular bisector of the line joining A and B is the locus of points
equidistant from A and B we must show two things:

(1) If x is on the bisector, it is equidistant from A and B.

(2) If x is equidistant from A and B, it is on the bisector.

If α is the class of points constituting the bisector, so that x ∈ α means
that x is on the bisector, and if F(x) is the statement that x is equidistant
from A and B, then (1) and (2) above are, respectively:

(1) (x).x ∈ α ⊃ F(x).

(2) (x).F(x) ⊃ x ∈ α.

When one has proved both (1) and (2), one has

(x).x ∈ α ≡ F(x).
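As a rough illustration of what it means for F(x) to determine α, the following Python sketch checks parts (1) and (2) on a small finite grid of points. The grid, the points A = (0, 0) and B = (4, 0), and all function names are assumptions made for this example; a finite check of this kind illustrates the definition and is no substitute for the geometric proof.

```python
from itertools import product

# Hypothetical choices for illustration only.
A, B = (0, 0), (4, 0)
universe = list(product(range(-3, 8), repeat=2))  # finite grid standing in for the plane


def sq_dist(p, q):
    """Squared Euclidean distance (avoids floating-point square roots)."""
    return (p[0] - q[0]) ** 2 + (p[1] - q[1]) ** 2


def F(x):
    """The condition: x is equidistant from A and B."""
    return sq_dist(x, A) == sq_dist(x, B)


# The class alpha: grid points on the perpendicular bisector, the line with first coordinate 2.
alpha = {p for p in universe if p[0] == 2}

# (1) every member of alpha satisfies F
part1 = all(F(x) for x in universe if x in alpha)
# (2) everything satisfying F is a member of alpha
part2 = all(x in alpha for x in universe if F(x))
# the two-way statement: x is in alpha if and only if F(x)
determines = all((x in alpha) == F(x) for x in universe)
```

On this grid, `determines` holds exactly when `part1` and `part2` both hold, which mirrors the passage from (1) and (2) to the equivalence.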

As another instance of a class being defined by a condition F(x), recall
that the derived set, β, of a set, α, is defined by the condition that the
members of β shall be limit points of α. In this case F(x) would be
(ε)(Ey).y ≠ x.y ∈ α.|x - y| < ε, and we recall that the definition of β was
given by

(x):.x ∈ β: ≡ :(ε)(Ey).y ≠ x.y ∈ α.|x - y| < ε.

In quite analogous fashion, if we wish α to be the set of points of
discontinuity of a function, f, we write

(x):.x ∈ α: ≡ :(Eε)(δ)(Ey).|y - x| < δ.~(|f(y) - f(x)| < ε).

Other  instances  of  classes  which  are  determined  by  conditions  are  the 
class  of  prime  divisors  of  an  integer,  the  class  of  nth  roots  of  unity,  a  circle 
(all  points  at  a  distance  r  from  a  point  c),  etc. 

Clearly  conditions  can  define  finite  classes;  for  instance,  the  condition 
"x  is  an  even  prime"  defines  a  class  with  the  single  member  2.  Conversely, 
every  finite  class  is  determined  by  some  condition.  In  particular,  if 
a₁, a₂, . . . , aₙ are the members of the finite class, then

x = a₁ ∨ x = a₂ ∨ · · · ∨ x = aₙ

is  a  condition  determining  the  class.  Thus  the  procedure  of  defining  classes 
by  conditions  is  valid  for  finite  classes  as  well  as  for  infinite  classes  and  so 
has  won  general  acceptance  as  a  suitable  procedure  for  defining  all  kinds  of 
classes.  There  is  also  general  acceptance  of  the  principle  that  every  condi- 
tion determines  a  class,  and  every  class  has  a  determining  condition.  As  it 
happens,  this  principle  is  false.  Nevertheless,  belief  in  it  is  so  strong  that 
proofs  of  its  falsity  are  called  paradoxes  and  are  widely  ignored. 

To  show  that  not  every  class  has  a  determining  condition,  we  give  the 
following  proof,  known  as  Skolem's  paradox  (see  Skolem,  1929).  For  each 
real  number,  we  can  determine  at  least  one  class  of  real  numbers;  for  in- 
stance, the  class  of  all  smaller  real  numbers,  or  the  class  of  all  larger  real 
numbers (the two classes of a Dedekind cut), or the class whose sole member
is  the  given  real  number,  etc.  Thus,  the  set  of  all  classes  of  real  numbers  is 
not  denumerable,  since  the  set  of  real  numbers  is  not  denumerable.  Then 
the  set  of  all  classes  whatsoever  is  certainly  not  denumerable.  So,  if  every 
class  has  a  determining  condition,  the  set  of  conditions  must  be  non- 
denumerable.  As  every  condition  is  a  statement,  the  set  of  statements 
must  accordingly  also  be  nondenumerable.  However,  the  set  of  statements 
is  denumerable.  For  English  statements,  this  is  easily  shown,  since  each 
statement  is  a  finite  sequence  of  letters  of  the  English  alphabet,  and  there 
are  only  a  finite  number  of  letters  in  the  alphabet.  Even  with  a  denumer- 
able alphabet,  such  as  most  symbolic  logics  have,  the  number  of  statements 
would  still  be  denumerable. 
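The denumerability claim can be made concrete with a short Python sketch that lists every finite string over a finite alphabet, shortest first, so that each string receives a definite position in a single list. The two-letter alphabet and the names `all_strings` and `position` are assumptions chosen for illustration.

```python
from itertools import count, islice, product


def all_strings(alphabet):
    """Yield every nonempty finite string over `alphabet`, shortest first."""
    for length in count(1):
        for letters in product(alphabet, repeat=length):
            yield "".join(letters)


def position(target, alphabet):
    """1-based position of `target` in the enumeration.

    Assumes `target` is nonempty and uses only letters of `alphabet`;
    otherwise the search would never stop.
    """
    for index, s in enumerate(all_strings(alphabet), start=1):
        if s == target:
            return index


first = list(islice(all_strings("ab"), 6))
where = position("ba", "ab")
```

Since every statement is among these strings, pairing each string with its position gives a one-to-one listing of the statements by positive integers.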

The  fairly  obvious  suggestions  have  been  made  that  one  might  either  use 
a  nondenumerable  alphabet  or  else  permit  statements  which  are  infinite 
sequences  of  letters  (see  Helmer,  1938).  We  cannot  conceive  how  to  do 
this  without  violating  our  basic  consideration  that  all  our  dealings  with 
statements  must  be  of  a  constructive  sort.  Moreover,  an  increase  in  the 
number  of  statements  would  not  necessarily  eliminate  the  Skolem  paradox. 
If one has as many statements as there are real numbers, for instance, one
still  can  derive  the  Skolem  paradox  by  starting  with  some  still  more 
numerous  class,  such  as  the  class  of  all  functions  of  real  numbers. 

Some  people  propose  to  side-step  the  Skolem  paradox  by  saying  that  we 
have  prescribed  wrongly  what  a  condition  is.  We  have  said  that  a  condition 
on x is a statement with free occurrences of x. If, instead, one defines a
condition  as  a  proposition  with  free  occurrences  of  x,  and  if  one  is  sufficiently 
vague  as  to  what  a  proposition  is,  then  one  cannot  prove  that  the  set  of 
conditions  is  denumerable.  Thus  one  avoids  the  Skolem  paradox,  but  one 
must  then  admit  the  existence  of  propositions  which  cannot  be  expressed 
by  any  statement. 

We  fail  to  see  that  this  means  of  avoiding  the  Skolem  paradox  is  of  any 
value.  The  advantage  of  having  a  determining  condition  for  a  class  is  that, 
while  one  cannot  deal  directly  with  the  class  because  it  has  an  infinity  of 
members,  one  can  deal  with  the  condition,  because  it  is  a  finite  statement. 
If,  however,  we  allow  the  condition  to  become  a  vague  thing  called  a  propo- 
sition, about  which  we  know  essentially  no  more  than  we  do  about  the  class 
we  wish  to  deal  with,  then  we  lose  the  advantage  that  should  accrue  from 
having  a  determining  condition  for  our  class. 

Moreover,  it  really  doesn't  matter  that  there  should  be  classes  with  no 
determining  conditions  because  no  one  will  ever  exhibit  such  a  class.  As 
soon  as  one  exhibits  some  particular  class,  one  can  then  find  a  condition 
which  determines  that  class,  for  if  no  better  condition  is  available,  one  can 
take  the  condition  of  being  a  member  of  that  class.  Since  we  are  dealing 
with  an  explicit  class,  this  will  give  an  explicit  condition  for  that  class. 

The  above  argument  shows  that  there  must  be  some  classes  which  can 
never  be  explicitly  exhibited,  and  that  it  is  among  these  classes  that  the 
classes  will  occur  for  which  there  are  no  determining  conditions. 

So  we  stand  by  our  original  prescription  that  a  condition  on  x  shall  be  a 
statement  with  free  occurrences  of  x,  and  accept  the  fact  that  there  must 
therefore  be  some  classes  (which  can  never  be  exhibited)  for  which  there  is 
no  determining  condition. 

We  now  turn  to  a  much  more  painful  point,  namely,  the  proof  that  there 
are  conditions  which  determine  no  class.  At  least  three  such  proofs  are 
known.  They  are  called  the  Russell  paradox,  the  Cantor  paradox,  and  the 
Burali-Forti  paradox.  The  Cantor  paradox  depends  upon  the  theory  of 
cardinal  numbers,  and  the  Burali-Forti  paradox  depends  upon  the  theory 
of  ordinal  numbers,  and  so  discussion  of  these  two  paradoxes  will  be  post- 
poned until  we  have  developed  these  theories. 

However, the Russell paradox is very simple. The condition F(x) which
Russell considered is the condition ~ x ∈ x. This seems a perfectly
respectable condition. In fact, it is satisfied by most objects x. If x is not a
class, it can have no members, and in such case certainly ~ x ∈ x. Even for
many classes x, we have ~ x ∈ x. For instance, let x be the class of prime
numbers, PN. This class is certainly not a prime number, so that
~ PN ∈ PN, and PN satisfies our condition. Similarly for the class of
differentiable functions, and many other classes. As many classes and all
nonclasses satisfy our condition, it must be a respectable condition.
Nevertheless, it determines no class. For suppose it does; then there is an α
such that

(x) (x ∈ α ≡ ~ x ∈ x).

In symbols

(Eα)(x) (x ∈ α ≡ ~ x ∈ x).

However, by Axiom scheme 6,

⊢ (x) (x ∈ α ≡ ~ x ∈ x) ⊃ (α ∈ α ≡ ~ α ∈ α).

We recollect that ~ α ∈ α is just ~(α ∈ α). So we have

⊢ (x) (x ∈ α ≡ ~ x ∈ x) ⊃ (α ∈ α ≡ ~(α ∈ α)).

However, by truth tables, ⊢ (P ≡ ~P) ⊃ Q~Q. So

⊢ (x) (x ∈ α ≡ ~ x ∈ x) ⊃ Q~Q.

Hence by rule G and Thm.VI.6.6, Part VII,

⊢ (Eα) (x) (x ∈ α ≡ ~ x ∈ x) ⊃ Q~Q.

Hence, if the condition ~ x ∈ x could determine a class, one could derive
a contradiction.
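The truth-table step of this derivation can be checked mechanically. The Python sketch below first verifies that (P ≡ ~P) ⊃ Q~Q comes out true under all four assignments, and then, as a separate finite illustration, searches every membership relation on a three-object universe for an object α with (x)(x ∈ α ≡ ~ x ∈ x). The three-object universe and all names are assumptions of this example; the exhaustive search illustrates the derivation above on small models and does not replace it.

```python
from itertools import product


def implies(p, q):
    return (not p) or q


def step6(p, q):
    """Truth value of (P ≡ ~P) ⊃ (Q & ~Q)."""
    return implies(p == (not p), q and (not q))


# Tautology check over all truth-value assignments to P and Q.
is_tautology = all(step6(p, q) for p, q in product([False, True], repeat=2))


def has_russell_class(universe, member):
    """True if some a satisfies: for all x, (x in a) iff not (x in x)."""
    return any(
        all(member(x, a) == (not member(x, x)) for x in universe)
        for a in universe
    )


universe = range(3)  # hypothetical tiny universe
pairs = [(x, y) for x in universe for y in universe]

found = False
for bits in product([False, True], repeat=len(pairs)):
    # `relation` holds the pairs (x, a) for which "x is a member of a" is true.
    relation = {pair for pair, bit in zip(pairs, bits) if bit}
    if has_russell_class(universe, lambda x, a: (x, a) in relation):
        found = True
```

The search fails for the reason given in the text: taking x to be the candidate α itself forces α ∈ α ≡ ~(α ∈ α).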

The  Cantor  and  Burali-Forti  paradoxes  were  discovered  around  the 
same  time  as  the  Russell  paradox  but  they  are  quite  complex,  enough  so  to 
permit  plausible  doubt  of  their  validity  on  various  grounds.  Not  so  the 
Russell  paradox,  which  uses  only  simple  and  well-established  logical  princi- 
ples. Accordingly,  it  was  mainly  because  of  the  appearance  of  the  Russell 
paradox  that  a  serious  doubt  was  raised  against  the  validity  of  the  principle 
that  every  condition  should  determine  a  class. 

We  have  carefully  led  up  to  this  point  in  such  a  manner  as  to  suggest  that 
the  appropriate  reaction  to  Russell's  (and  Cantor's  and  Burali-Forti's) 
paradox  is  to  abandon  the  principle  that  every  condition  should  determine 
a  class;  indeed,  such  is  our  opinion.  Nevertheless,  resistance  to  such  a 
reaction  has  been  persistent  and  prolonged.  Various  alternative  measures 
have  been  proposed  to  preserve  the  principle  that  every  condition  shall 
determine  a  class. 

We  give  a  brief  survey  of  the  better  known  of  these  measures.  Let  us 
list the key steps in the proof of the Russell paradox.

1. ~ x ∈ x is a statement.

2. ~ x ∈ x contains free occurrences of x.

3. Hence ~ x ∈ x is a condition.

4. (x) (x ∈ α ≡ ~ x ∈ x) is the statement that the condition ~ x ∈ x
determines the class α.

5. (x) (x ∈ α ≡ ~ x ∈ x) ⊃ (α ∈ α ≡ ~ α ∈ α).

6. (P ≡ ~P) ⊃ Q~Q.

7. Every condition determines a class.

Anyone  who  would  seek  to  preserve  the  validity  of  step  7  must  perforce 
deny  one  of  the  other  steps.  Step  2  seems  unexceptionable,  and  step  3 
merely  embodies  the  definition  of  a  condition.  Step  5  uses  the  principle 
that  a  statement  which  is  true  for  all  objects  x  must  be  true  for  a,  and  can 
hardly  be  denied.  This  leaves  only  steps  1,  4,  and  6  that  can  plausibly  be 
denied  if  one  wishes  to  preserve  the  validity  of  step  7. 

Various  proposals  for  eliminating  the  Russell  paradox  involve  denial  of 
one  or  more  of  steps  1,  4,  or  6. 

Step  6  depends  upon  the  system  of  truth  values  set  up  in  Chapter  II. 
To  deny  step  6  would  necessitate  devising  a  new  system  to  replace  the 
statement and predicate calculus. This is a large undertaking. It has
been attempted by a few people but not yet completed by anyone. A
favorite  starting  point  is  to  assume  three  truth  values  instead  of  the  two 
truth  values  of  Chapter  II,  and  thus  get  a  "three-valued"  logic.  This 
eliminates  step  6,  but  unless  the  three-valued  logic  is  set  up  with  a  great 
deal  of  care,  alternative  forms  of  the  Russell  paradox  can  still  be  derived. 
A  particular  three-valued  system  set  up  by  Bocvar  (see  Bocvar,  1939)  does 
appear  to  avoid  the  Russell  paradox.  However,  Bocvar's  system  is  only 
very  partially  developed,  and  it  still  remains  to  be  seen  if  Bocvar's  system 
avoids  the  Cantor  and  Burali-Forti  paradoxes.  Even  if  it  does,  it  represents 
a  violent  departure  from  accepted  mathematical  reasoning  and  so  is  not 
likely  to  become  popular  any  time  in  the  near  future. 

On  the  whole,  attempts  to  deny  step  6  have  not  been  very  successful. 

There  are  systems  of  logic  in  widespread  use  which  deny  step  4.  These 
developed from a system invented by Zermelo and are often called systems
of Zermelo set theory, or just "set theory" for short (see Zermelo, 1908,
second paper; Gödel, 1940; and Quine, 1951). In set theory, a substitute
for  step  4  is  based  on  the  following  notion.  Let  us  consider  that  there  is  a 
distinction  between  sets  and  classes.  All  sets  are  classes,  but  some  classes 
are  nonsets.  Often  the  term  "individual"  is  used  instead  of  "set,"  so  that 
some  classes  are  individuals,  and  some  are  nonindividuals,  while  all  indi- 
viduals are  classes.  The  distinction  between  a  set  and  a  nonset  is  that  sets 
can  be  members  of  classes  but  nonsets  cannot  be  members  of  anything. 
It turns out that every set actually is a member of some class. In other
words, x is a set if and only if (Eβ) x ∈ β.

As nonsets cannot be members of anything, the members for any class
must necessarily be chosen from among sets only. Hence, the class
determined by any condition F(x) shall consist not of all objects x which
satisfy F(x), but only of all sets x which satisfy F(x). In other words, we
express the statement "F(x) determines the class α" by

(x).x ∈ α ≡ F(x) (x is a set),

which is equivalent to

(x).x ∈ α ≡ F(x)(Eβ) x ∈ β.

Then step 4 is replaced by

"(x).x ∈ α ≡ (~ x ∈ x) (Eβ) x ∈ β is the statement that the condition
~ x ∈ x determines the class α".

If one assumes that there is a class α determined by ~ x ∈ x, one can then
infer α ∈ α ≡ (~ α ∈ α) (Eβ) α ∈ β. Applying truth values to this leads to no
contradiction, but only to the conclusion ~(Eβ) α ∈ β. That is, α is a
nonset.
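The phrase "applying truth values" can be spelled out with a small Python sketch. Here the letter A abbreviates α ∈ α and S abbreviates (Eβ) α ∈ β, that is, "α is a set"; both abbreviations and the variable names are assumptions introduced for this example. The sketch lists every assignment of truth values under which A ≡ (~A)S holds.

```python
from itertools import product

# A stands for "alpha is a member of alpha"; S stands for "alpha is a set".
# Keep the assignments (A, S) that make  A ≡ (~A & S)  true.
solutions = [
    (a, s)
    for a, s in product([False, True], repeat=2)
    if a == ((not a) and s)
]

consistent = len(solutions) > 0                   # some assignment works: no contradiction
forces_nonset = all(not s for _, s in solutions)  # every such assignment makes S false
```

Exactly one assignment survives, with both A and S false, which agrees with the conclusion that α is a nonset and, in addition, is not a member of itself.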

So in Zermelo set theory, the condition ~ x ∈ x determines a class but this
class  is  a  nonset.  Similarly  both  the  Cantor  and  Burali-Forti  paradoxes 
depend  at  a  critical  point  on  some  class  being  a  set  and  so  will  fail  to  go 
through  because  the  class  in  question  can  be  and  apparently  is  a  nonset. 

There  is  still  a  difficulty  to  be  surmounted.  Most  of  the  classes  arising  in 
mathematics  have  to  be  members  of  something  in  one  proof  or  another  and 
hence  must  be  sets.  So  one  must  devise  a  criterion  for  deciding  which 
classes  are  sets,  and  this  criterion  must  admit  as  sets  most  classes  of  mathe- 
matics without  admitting  as  sets  the  critical  classes  arising  in  the  Russell, 
Cantor,  and  Burali-Forti  paradoxes.  One  basic  criterion  often  used  is 
roughly  that  classes  with  an  excessively  large  number  of  members  are 
nonsets. For details, we refer the reader to Gödel, 1940, or Quine, 1951,
in  which  distinct  criteria  are  employed. 

On  the  whole,  denial  of  step  4  in  the  manner  just  indicated  works  reason- 
ably well.  Alternatively,  a  considerable  number  of  mathematicians  deny 
step  1.  The  two  best-known  schools  of  thought  in  this  doctrine  are  those 
of Brouwer and Russell, who have contrived definitions of being a
statement according to which ~ x ∈ x is not a statement.

Brouwer (see Heyting, 1934) seems to say that a collection of words,
F(x), is a statement only in case there is a constructive method whereby
for each x one can determine definitely whether F(x) is true or false. Thus
"x is a prime" is a statement because one can test the primality of x by
dividing it by all integers ≤ √x. On the other hand, "x is an integer
exponent such that there are nonzero integers a, b, and c with aˣ + bˣ = cˣ"
is perhaps not a statement, because there is no way known to decide for a
given x whether the statement is false or true. Brouwer's restriction is
drastic but effective. There is no constructive method whereby for each x
one can determine definitely whether ~ x ∈ x, and so ~ x ∈ x is not a
statement. In fact, step 6 is also thrown out, since it is not a statement
either.

Brouwer  has  got  some  followers,  but  since  his  criterion  apparently  rules 
that  most  of  the  theorems  of  mathematics  are  not  statements,  there  has 
been  no  general  acceptance  of  his  views. 

Russell  was  much  more  moderate  in  his  proposal.  He  recognized  that 
one  should  set  up  the  criterion  for  what  a  statement  is  in  such  a  way  that 
most theorems of mathematics would remain statements, but ~ x ∈ x
would fail to be a statement. He proposed his famous theory of types to
provide such a criterion. According to this, one conceives of the objects
of discourse arranged into types. The objects of type n shall be members
only of objects of type n + 1, and objects of type n + 1 shall have as
members  only  objects  of  type  n.  Any  sentence  violating  these  conditions 
is  to  be  outlawed,  and  not  considered  a  statement. 

Clearly x ∈ x violates the conditions, because whatever type x is in, any
member of x must belong to the next lower type and so cannot be x. So
x ∈ x is outlawed, and with it ~ x ∈ x. Russell pointed out that certain
sentences  in  the  derivations  of  Cantor's  paradox  and  Burali-Forti's  paradox 
would  likewise  be  outlawed  by  his  theory  of  types,  so  that  use  of  the  theory 
of  types  would  eliminate  these  paradoxes  also. 

One can reduce Russell's theory of types to a mechanical rule. Consider
any sentence P and consider the parts of it of the form x ∈ α. If P is not to
violate the type restrictions, x must be in a type exactly one lower than α
for every part of the form x ∈ α in P. Suppose we attach as subscripts to
the x's and α's the numbers of the types to which they belong. Then we
have attached numerical subscripts to the variables in such a way that, in
every part of the form x ∈ α, the α has a subscript exactly one greater than
the subscript on the x.

Quine  (see  Quine,  1937)  has  suggested  that  any  sentence  for  which  such 
an  assignment  of  subscripts  is  possible  be  called  stratified.  Thus  we  see 
that  the  theory  of  types  would  outlaw  every  sentence  which  is  not  stratified 
and  leave  only  stratified  sentences  as  statements. 

Actually  the  theory  of  types  would  outlaw  many  other  sentences  besides 
those  which  are  not  stratified.  The  original  theory  of  types,  as  set  forth  by 
Whitehead  and  Russell  in  "Principia  Mathematica,"  was  a  wondrously 
complex  affair  involving  multiple  subscripts  not  only  on  variables  but  on 
statements  also.  Ramsey  (see  Ramsey,  1926)  later  introduced  the  so-called 
"simplified  theory  of  types."    In  this  system,  single  subscripts  are  attached 
to variables, and to variables only, but they are permanently attached.
Then a sentence is a statement if and only if for every part of the form
x ∈ α, the subscript permanently attached to α is exactly one greater than
the subscript permanently attached to x.

Quine (see Quine, 1937) has suggested as a still simpler theory of types
that one simply admit as statements all sentences which are stratified.
This is appreciably more liberal than Ramsey's system. Quine would
admit both x ∈ y and y ∈ x as statements, because each is (separately)
stratified. Thus, to stratify x ∈ y, we attach subscripts so: x₁ ∈ y₂.
Similarly, to stratify y ∈ x, we attach subscripts so: y₁ ∈ x₂. However, Ramsey
has subscripts permanently attached to each variable, and so in both x ∈ y
and y ∈ x there would already be subscripts on x and y, and it would be the
values of these permanently attached subscripts which would determine
whether x ∈ y or y ∈ x should be statements. For x ∈ y to be a statement,
the subscript on x must be one less than the subscript on y. If y ∈ x is to be
a statement, then the subscript on y must be one less than the subscript
on x. Accordingly, for Ramsey, it could never be that both x ∈ y and y ∈ x
are statements, and often neither would be a statement. However x ∈ x,
and hence ~ x ∈ x, would not be a statement for either Quine or Ramsey.
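Ramsey's rule with permanently attached subscripts lends itself to a very small Python sketch. A sentence is represented here only by its parts of the form "member ∈ container", given as pairs of variable names, and the permanent subscripts are given as a dictionary. This representation, the function name `is_statement_ramsey`, and the range of subscripts tried are all assumptions made for illustration.

```python
from itertools import product


def is_statement_ramsey(parts, subscript):
    """Ramsey-style test: in every part (member, container), the container's
    permanent subscript must be exactly one greater than the member's."""
    return all(
        subscript[container] == subscript[member] + 1
        for member, container in parts
    )


# One hypothetical permanent assignment of subscripts.
example = {"x": 1, "y": 2}
x_in_y = is_statement_ramsey([("x", "y")], example)
y_in_x = is_statement_ramsey([("y", "x")], example)
x_in_x = is_statement_ramsey([("x", "x")], example)

# Try every assignment of subscripts 0..4 to x and y: are both x ∈ y and
# y ∈ x ever statements at the same time?
both_possible = False
for sx, sy in product(range(5), repeat=2):
    subscript = {"x": sx, "y": sy}
    if is_statement_ramsey([("x", "y")], subscript) and is_statement_ramsey(
        [("y", "x")], subscript
    ):
        both_possible = True
```

Within the range tried, no permanent assignment makes both x ∈ y and y ∈ x statements, in line with the remark about Ramsey's system.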

Quine's  theory  of  types  seems  adequate  to  avoid  the  known  paradoxes, 
and  is  much  less  cumbersome  than  the  Russell  or  Ramsey  theory  of  types. 
However,  there  is  one  disconcerting  feature  of  the  Quine  theory  of  types. 
Both x ∈ y and y ∈ x are stratified, as we saw, so that neither is outlawed.
However, their logical product, (x ∈ y)(y ∈ x), is not stratified, and so must
be outlawed. When x's occur in different statements, as x ∈ y and y ∈ x,
they may have different subscripts attached for the purposes of
stratification, since separate sentences are to be stratified separately. However,
throughout a single sentence, such as (x ∈ y)(y ∈ x), one must attach the
same subscript to all the x's that occur. The same applies to y, and so one
easily sees that (x ∈ y)(y ∈ x) cannot be stratified, though its parts can be
stratified separately.
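The stratification test just described can be sketched in Python as a search for a consistent assignment of subscripts. The sketch makes simplifying assumptions that are not part of the text: a sentence is reduced to its list of parts "member ∈ container", each given as a pair of variable names, and every occurrence of a given letter receives the same subscript. The function name `stratify` is hypothetical.

```python
def stratify(parts):
    """Try to attach integer subscripts so that, in every part (m, c) read as
    m ∈ c, the subscript on c is exactly one greater than the subscript on m.
    Return a dict of subscripts, or None if the sentence is not stratified."""
    # Record, for each variable, its neighbours and the required offset.
    neighbours = {}
    for m, c in parts:
        neighbours.setdefault(m, []).append((c, +1))
        neighbours.setdefault(c, []).append((m, -1))

    level = {}
    for start in neighbours:
        if start in level:
            continue
        level[start] = 0
        stack = [start]
        while stack:
            v = stack.pop()
            for w, delta in neighbours[v]:
                want = level[v] + delta
                if w not in level:
                    level[w] = want
                    stack.append(w)
                elif level[w] != want:
                    return None  # two incompatible demands on one variable

    # Shift everything so the smallest subscript is 1; differences are unchanged.
    low = min(level.values(), default=1)
    return {v: n - low + 1 for v, n in level.items()}


x_in_y = stratify([("x", "y")])               # x ∈ y
y_in_x = stratify([("y", "x")])               # y ∈ x
product_ = stratify([("x", "y"), ("y", "x")]) # (x ∈ y)(y ∈ x)
self_member = stratify([("x", "x")])          # x ∈ x
```

Each of x ∈ y and y ∈ x receives an assignment when treated separately, while the logical product and x ∈ x receive none, because the same letter would need two different subscripts.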

All  in  all,  denial  of  step  1  is  feasible,  but  not  wholly  satisfactory. 

There  still  remains  the  possibility  of  denying  step  7,  which  appeals  to  us 
as  the  most  natural  of  all.    We  adopt  this  procedure. 

As  we  are  not  denying  step  1,  we  have  no  need  to  outlaw  various  types  of 
sentences.  We  shall  consider  any  declarative  sentence  to  be  a  statement. 
In particular, ~ x ∈ x is to be considered a perfectly legitimate statement
and hence is a condition. However, we do not claim that every condition
shall determine a class. In fact, from our point of view, the Russell paradox
is merely a proof that ~ x ∈ x does not determine a class. Similarly, the
Cantor  and  Burali-Forti  paradoxes  are  merely  proofs  that  a  couple  of 
other  statements  do  not  determine  classes. 

We  now  wish  a  criterion  for  deciding  which  conditions  shall  determine 
classes. Clearly this criterion must exclude the statements arising in the
paradoxes,  while  including  all  the  conditions  commonly  used  in  mathe- 
matics to  determine  classes.  Our  problem  is  analogous  to  the  problem  in 
Zermelo  set  theory  of  finding  a  criterion  which  would  classify  most  classes 
of  mathematics  as  sets,  but  which  would  classify  as  nonsets  the  classes  aris- 
ing in  connection  with  the  paradoxes.  So  far,  no  means  of  adapting  the 
solution  of  the  set-theory  problem  to  our  problem  has  been  suggested. 
Instead,  we  have  adopted  a  suggestion  of  Quine  (see  Quine,  1937)  that 
stratification  could  be  used  as  the  criterion.  This  criterion  is  to  be  used  in 
a  positive  rather  than  a  negative  sense.  That  is,  we  require  that,  if  a  con- 
dition is  stratified,  then  it  shall  determine  a  class,  but  we  do  not  insist  that, 
if  a  condition  is  unstratified,  then  it  may  not  determine  a  class.  It  will 
turn  out  in  fact  that  many  unstratified  conditions  do  determine  classes. 
Fortunately,  this  is  not  true  of  the  critical  statements  arising  in  the  known 
paradoxes,  as  far  as  can  be  determined,  and  so  the  known  paradoxes  merely 
serve  to  prove  that  certain  unstratified  conditions  fail  to  determine  classes. 

The  resulting  system,  which  is  used  in  the  present  text,  is  known  as 
"Quine's  New  Foundations,"  after  the  title  of  the  paper  in  which  it  was 
proposed  (see  Quine,  1937). 

It  will  perhaps  have  occurred  to  the  reader  that,  since  stratification 
seems  to  provide  us  with  an  answer  to  our  problem  of  deciding  which 
conditions  shall  determine  classes,  it  might  provide  an  answer  to  the  analo- 
gous problem  in  Zermelo  set  theory  of  deciding  which  classes  shall  be  sets 
and  which  nonsets.  This  was  suggested  by  Quine  (see  Quine,  1940),  who 
proposed  that  all  classes  determined  by  stratified  conditions  be  admitted  as 
sets.  Actually,  this  is  too  liberal.  It  seems  to  avoid  the  Russell  and  Cantor 
paradoxes,  but  permits  the  Burali-Forti  paradox  (see  Rosser,  1942).  How- 
ever, by  adding  one  additional  very  minor  restriction,  Quine's  proposal 
appears  to  work  quite  well  (see  Quine,  1951). 

There  is  a  warning  in  this.  The  problem  of  the  exact  relationship  be- 
tween classes  and  statements  is  a  difficult  and  subtle  one  and  must  be 
approached  quite  warily.  Undoubtedly  the  last  word  on  the  subject  has  not 
yet  been  said.  All  the  present  suggestions  for  avoiding  the  paradoxes  retain 
a  tinge  of  artificiality.  Certainly  the  theory  of  types  is  artificial.  In  the 
Zermelo  set  theory,  the  distinction  between  sets  and  nonsets  is  irksome, 
and  the  various  criteria  for  deciding  between  sets  and  nonsets  are  not 
intuitively  very  natural. 

In  the  system  of  the  present  text,  Quine's  New  Foundations,  it  is  irksome 
that  not  all  conditions  determine  classes,  and  the  criterion  of  stratification 
for  deciding  which  conditions  shall  determine  classes  is  not  intuitively 
natural. 

Both  set  theory  and  Quine's  New  Foundations  reproduce  mathematical 
reasoning with a remarkably close approximation to the unhampered
methods in use before the discovery of the paradoxes. Certain regions of the
theory  of  cardinals  and  ordinals  are  too  near  the  paradoxes  to  survive 
unchanged,  but  it  has  been  possible  to  amputate  the  known  paradoxes  with 
remarkably  little  injury  to  the  main  body  of  mathematics. 

Still  other  means  of  avoiding  the  paradoxes  have  been  suggested.  Church 
and  Curry  agree  with  us  in  denying  that  every  condition  must  determine  a 
class,  but  their  suggestions  are  very  drastic.  In  effect,  they  propose  dis- 
pensing with  classes  altogether,  and  dealing  only  with  the  conditions  them- 
selves. In  order  to  succeed  in  such  an  endeavor,  they  must  allow  that  one 
operate  on  conditions  in  a  way  which  is  usually  not  allowed  (and  usually 
not necessary, because one can ordinarily use operations on classes instead).
In their first attempt, Church and Curry were too liberal in allowing
operations on conditions and became involved in a rather intricate and
unforeseen contradiction (see Kleene and Rosser). Their present suggestions
(see  Church,  1941,  and  Curry,  1942)  are  more  conservative  and  are  prob- 
ably free  from  contradictions.  However,  these  suggestions  have  not  yet 
been  thoroughly  developed. 

In  summary,  if  one  wishes  something  as  like  classical  mathematics  as 
possible,  one  has  at  present  the  option  of  choosing  either  a  Zermelo  set 
theory  or  Quine's  New  Foundations  or  a  version  of  type  theory. 

In  view  of  the  fact  that  the  paradoxes  arose  from  logical  principles  which 
had  been  generally  accepted  for  thousands  of  years,  one  might  wonder  if 
one  does  not  run  a  risk  with  any  of  the  presently  proposed  systems  that  a 
contradiction  may  appear  after  the  system  has  been  generally  accepted 
and  in  use  for  a  long  time.  Apparently  one  does.  According  to  a  theorem 
of Gödel (see Gödel, 1931), if a system of logic is adequate for even a
reasonable facsimile of present-day mathematics, then there can be no adequate
assurance  that  it  is  free  of  contradiction.  Failure  to  derive  the  known 
paradoxes  is  very  negative  assurance  at  best  and  may  merely  indicate  lack 
of  skill  on  our  part.  Thus  it  would  be  very  unwise  to  adopt  a  firm  belief  in 
the  validity  of  any  of  the  better-known  systems  of  logic.  One  should 
always  remain  aware  of  the  possibility  that  someone  may  discover  a  con- 
tradiction in  the  system  to  which  one  has  become  accustomed  and  compel 
a  change  to  some  other  system.  In  fact,  there  is  the  awkward  possibility 
that  present-day  mathematics  is  actually  in  serious  error,  so  that  any  formal 
system  which  gives  a  reasonable  facsimile  of  present-day  mathematics 
must  contain  a  contradiction.  We  do  not  believe  this  to  be  the  case,  but 
we  can  offer  no  reason  why  it  should  not  be  the  case. 

2.  Axioms  for  Classes.  One  of  our  axioms  for  classes  will  have  to  state 
that  stratified  conditions  shall  determine  classes.  To  state  this  axiom 
precisely,  we  shall  need  a  precise  definition  of  stratification  and  of  what  a 
statement is. Up to now, we have not been able to give a precise definition
of  what  a  statement  is,  because  we  had  not  yet  introduced  all  the  constitu- 
ents of  statements.  Now,  with  the  introduction  of  the  notion  of  class 
membership,  we  have  available  all  constituents  of  statements  and  can 
give  a  precise  definition  of  what  a  statement  is.  Also,  we  can  now  define 
equality  and  replace  Axiom  schemes  7A  and  7B  by  a  single  weaker  axiom 
scheme.  Accordingly,  this  seems  a  good  place  to  restate  our  former  axioms 
as  well  as  stating  the  new  axioms  for  classes. 

We  now  start  from  scratch.  The  constituents  which  we  shall  allow  for 
building  up  statements  are: 

( , ), ~, &, ι, ∈, variables.

With regard to the use of "(" and ")", we should make it clear that our
former carefree and indiscriminate use of "( )", "{ }", and "[ ]" is not
permitted  in  a  strict  formulation  such  as  we  now  present.  Nevertheless, 
when  we  have  finished  presenting  the  strict  formulation,  we  shall  revert  to 
our  former  practice  of  omitting  parentheses  when  it  can  be  done  unam- 
biguously, and  of  using  different  styles  of  parentheses,  or  brackets,  or 
braces,  or  dots,  whenever  it  will  help  the  eye. 

The  supply  of  variables  is  supposed  to  be  inexhaustible.  As  we  never 
need  more  than  a  finite  number  at  a  time,  it  suffices  to  have  a  denumerable 
set  of  variables. 

We  now  define  "term",  "statement",  "free",  and  "bound". 

(a)  Each  variable  is  a  term;  in  this  term  the  occurrence  of  the  variable 
in  question  is  free. 

(b) If A and B are terms, then (A ε B) is a statement; the occurrences of
variables in the parts A and B, respectively, are free or bound in (A ε B)
according as they are free or bound in A and B, respectively.

(c) If x is a variable and P is a statement, then (x) P is a statement and
ιx P is a term; in these all occurrences of x are bound, but for each variable
different from x the occurrences in the part P are free or bound in (x) P and
ιx P according as they are free or bound in P.

(d) If P and Q are statements, then ~P and (P&Q) are statements; the
occurrences of variables in ~P are free or bound according as they are free
or bound in P, and the occurrences of variables in the parts P and Q,
respectively, are free or bound in (P&Q) according as they are free or bound
in P and Q, respectively.

It is intended that those combinations of symbols, and only those
combinations of symbols, are terms or statements which can be shown to be
terms or statements by repeated use of (a), (b), (c), and (d).

Unless  otherwise  specified,  it  will  be  taken  for  granted  in  any  context 
that whenever (A ε B) is written A and B are terms; whenever (x) P or ιx P
is written x is a variable and P is a statement; whenever ~P is written P
is a statement; and whenever (P&Q) is written P and Q are statements.
This convention is to be automatically extended to abbreviations, such as
PvQ, (Ex) P, etc.

Any occurrence of (x) P or ιx P as a part of S will be called an x-bound
part of S, and P will be called the scope of (x) or ιx. One can prove by
induction on the number of symbols of S that, if x is a variable and S is a
statement, then those occurrences (if any) of x in an x-bound part of S are
bound in S, and all other occurrences (if any) of x in S are free in S.
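
The recursive clauses (a) to (d) lend themselves to a mechanical check. The following is a minimal sketch, not part of the formal system: it assumes an illustrative encoding of terms and statements as nested tuples (the tags "var", "iota", "in", "all", "not", "and" and the name free_vars are our own choices) and computes which variables have free occurrences.

```python
# Illustrative encoding (an assumption of this sketch, not the text's notation):
#   terms:      ("var", "x")            ("iota", "x", P)
#   statements: ("in", A, B)   ("all", "x", P)   ("not", P)   ("and", P, Q)

def free_vars(node):
    """Return the names of variables with at least one free occurrence in node."""
    tag = node[0]
    if tag == "var":                      # clause (a): the occurrence is free
        return {node[1]}
    if tag in ("all", "iota"):            # clause (c): the prefix binds x in P
        return free_vars(node[2]) - {node[1]}
    # clauses (b) and (d): freedom is inherited from the parts
    return set().union(*(free_vars(part) for part in node[1:]))

x, y, z = ("var", "x"), ("var", "y"), ("var", "z")
bound_part = ("all", "x", ("in", x, z))        # (x)(x ε z), an x-bound part
S = ("and", ("in", x, y), bound_part)          # ((x ε y)&(x)(x ε z))
print(sorted(free_vars(bound_part)))           # ['z']
print(sorted(free_vars(S)))                    # ['x', 'y', 'z']
```

In S the first occurrence of x is free while those inside the x-bound part are bound, which is why x still appears among the free variables of S but not among those of the x-bound part.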

If x_1, x_2, ..., x_n are distinct variables, then {Sub in S: A_1 for x_1,
A_2 for x_2, ..., A_n for x_n} shall denote the result of simultaneously
replacing each free occurrence (if any) of x_i in S by A_i (i = 1, 2, ..., n).

One can prove by induction on the number of symbols of S that, if
A_1, A_2, ..., A_n are each terms, and S is a term or a statement, then
{Sub in S: A_1 for x_1, A_2 for x_2, ..., A_n for x_n} is correspondingly a
term or a statement.

Whenever we use the notation, it is to be understood that S is a term or
statement and A_1, A_2, ..., A_n are each terms.

We say that the substitution indicated by {Sub in S: A_1 for x_1, A_2 for
x_2, ..., A_n for x_n} causes confusion of bound variables if there is a
variable y and an i (1 ≤ i ≤ n) such that there is a free occurrence of y in
A_i and there is an occurrence of x_i in S which is free in S and occurs in
some y-bound part of S; otherwise we say that the substitution causes no
confusion of bound variables.

This  is  merely  a  precise  formulation  of  our  earlier  definition  of  "confusion 
of  bound  variables." 
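
As an informal illustration of the definition just given, the following sketch performs a simultaneous substitution and refuses to proceed when confusion of bound variables would result. The tuple encoding and the names free_vars and substitute are assumptions made for the sketch only; nothing here is part of the formal system.

```python
# Illustrative encoding (an assumption of this sketch):
#   terms:      ("var", "x")            ("iota", "x", P)
#   statements: ("in", A, B)   ("all", "x", P)   ("not", P)   ("and", P, Q)

def free_vars(node):
    """Names of variables with at least one free occurrence in node."""
    tag = node[0]
    if tag == "var":
        return {node[1]}
    if tag in ("all", "iota"):
        return free_vars(node[2]) - {node[1]}
    return set().union(*(free_vars(part) for part in node[1:]))

def substitute(node, repl, bound=frozenset()):
    """Compute {Sub in node: repl[x] for x, ...} simultaneously.

    `bound` holds the variables y such that the current position lies in a
    y-bound part. Raises ValueError on confusion of bound variables.
    """
    tag = node[0]
    if tag == "var":
        name = node[1]
        if name in bound or name not in repl:   # bound occurrence, or untouched
            return node
        if free_vars(repl[name]) & bound:       # a free y would fall in a y-bound part
            raise ValueError("confusion of bound variables")
        return repl[name]
    if tag in ("all", "iota"):
        return (tag, node[1], substitute(node[2], repl, bound | {node[1]}))
    return (tag,) + tuple(substitute(part, repl, bound) for part in node[1:])

x, y, z = ("var", "x"), ("var", "y"), ("var", "z")
S = ("all", "y", ("in", x, y))                  # (y)(x ε y)
print(substitute(S, {"x": z}))                  # z for x: no confusion
print(substitute(("in", x, y), {"x": y, "y": x}))  # simultaneous swap gives (y ε x)
try:
    substitute(S, {"x": y})                     # y for x: y would become bound
except ValueError as err:
    print(err)
```

The check is made at each free occurrence being replaced, which mirrors the wording of the definition: confusion arises only when a free occurrence of x_i lies in a y-bound part and y occurs free in A_i.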

We shall commonly use more convenient notations in place of {Sub in S:
A_1 for x_1, A_2 for x_2, ..., A_n for x_n} in those cases in which the
substitution indicated causes no confusion of bound variables. Thus we often
write F(x,y) for S and F(y,y) for {Sub in S: y for x}. Also we often write
F(x) for S and F(ιy Q) for {Sub in S: ιy Q for x}.

A  formula  S  is  said  to  be  stratified  if  one  can  attach  an  integer  subscript 
to  each  occurrence  of  each  variable  in  S  in  such  a  way  that: 

(a) If x is a variable and P a statement which is part of S, then all
occurrences of x which are free in P shall have the same subscript attached.

(b) If x is a variable which occurs free in a statement P, then in each
particular occurrence (if any) of (x) P or ιx P as a part of S, the explicitly
indicated occurrence of x in the prefix (x) or ιx shall have the same
subscript attached as the free occurrences of x in that particular occurrence
of P.

(c) In each occurrence in S of a part of the form (x ε y), (ιx P ε y),
(x ε ιy Q), or (ιx P ε ιy Q) in which x and y are variables and P and Q are
statements, the subscript attached to the occurrence of y which is explicitly
indicated shall be exactly one greater than the subscript attached to the
occurrence of x which is explicitly indicated.

A  discussion  of  this  definition  is  in  order.  As  indicated  in  the  previous 
section,  stratification  is  to  be  a  mechanical  equivalent  of  the  simple  theory 
of  types  as  applied  to  any  particular  formula.  Throughout  any  given 
formula,  all  free  occurrences  of  a  given  variable  would  be  understood  as 
having  reference  to  the  same  object  (this  is  the  standard  procedure  in 
mathematics  with  regard  to  the  use  of  variables)  and  hence  would  all  have 
to  be  in  the  same  type  and  hence  should  have  the  same  subscript  attached. 
This we require in part (a) of the above definition where we interpret "part"
in the liberal sense which considers S a part of itself. However, bound
occurrences of x in one part of S would generally have no connection with
free occurrences of x elsewhere in S, and so could have different subscripts
attached. This is permitted by part (b). However, throughout any x-bound
part of S, uniformity of type for x is required, and this is the meaning
of part (b). One might be tempted to state part (b) in the simpler form:

(b*) All occurrences of x in any particular occurrence of (x) P or ιx P
shall have the same subscript attached.

However, this is too stringent. P may contain some free occurrences of
x, all with the subscript i, and some bound occurrences of x, all with the
subscript j. Then the x which goes on the front as part of (x) or ιx should
also have the subscript i, but one would not require equality of i and j.
Hence we use (b) rather than (b*).

Part  (c)  is  the  crux  of  the  definition.  It  says  that  a  member  of  a  class 
shall  be  of  type  exactly  one  lower  than  the  class.  Moreover,  part  (c)  has 
the consequence that ιx P shall be considered to have the same type as any
free x's occurring in P. This last is quite to be expected if one recalls
the meaning of ιx P.

Sometimes it is stated as a requirement for stratification that all subscripts
should be positive. This is quite a trivial requirement, for if one
has  an  assignment  of  subscripts  satisfying  (a),  (b),  and  (c),  one  can  add 
some  integer  uniformly  to  all  subscripts  and  get  another  assignment  of 
subscripts  satisfying  (a),  (b),  and  (c).  So  if  a  formula  can  be  stratified,  it 
can  be  stratified  with  all  subscripts  positive. 

One can readily make a direct test for stratification of any particular
formula S. Choose a variable occurring in S and choose a subscript, zero
perhaps. Attach this subscript to an occurrence of the variable. Then (a)
and (b) will perhaps require attaching the same subscript to other occurrences
of the variable. If already (c) is violated, our formula is not stratified.
If (c) is not violated, then (c) will usually require attaching appropriate
subscripts to further occurrences of variables. Then (a) and (b) may
require still further attachments. These may perhaps violate (c), in which
case the formula is not stratified. Otherwise, use (c) to determine still
other attachments. Ordinarily this process will proceed until one has
either stratified the formula or proved it unstratified. However, occasionally
the process will terminate and there will still remain occurrences of
variables with no subscripts attached. One merely attaches a subscript
to one of them and starts the process going again. If one repeats this sort
of thing enough, one eventually will either stratify the formula or prove
that it cannot be stratified.
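
The test just described can be mechanized. The sketch below is one possible rendering under stated assumptions, not a procedure taken from the text: statements are encoded as nested tuples (our own illustrative tags), parts (a) and (b) are met by giving a single "slot" to each free variable and to each prefix (x) or ιx, and part (c) becomes a constraint "slot of B is one higher than slot of A" for each part (A ε B). The constraints are solved with a union-find structure carrying offsets, a technique we have swapped in for the hand procedure.

```python
from itertools import count

# Illustrative encoding (an assumption of this sketch):
#   terms:      ("var", "x")            ("iota", "x", P)
#   statements: ("in", A, B)   ("all", "x", P)   ("not", P)   ("and", P, Q)

def constraints(node, env, fresh, out):
    """Collect pairs (lower, higher) of slots from part (c); return a term's slot."""
    tag = node[0]
    if tag == "var":
        # bound occurrences share the slot of their prefix, free ones a global slot
        return env.get(node[1], ("free", node[1]))
    if tag in ("all", "iota"):
        slot = ("bound", next(fresh))           # one slot per prefix: parts (a), (b)
        constraints(node[2], {**env, node[1]: slot}, fresh, out)
        return slot                             # for iota, this is the type of the term
    if tag == "in":
        lower = constraints(node[1], env, fresh, out)
        higher = constraints(node[2], env, fresh, out)
        out.append((lower, higher))             # part (c)
        return None
    for part in node[1:]:                       # "not" and "and"
        constraints(part, env, fresh, out)
    return None

def stratify(stmt):
    """Return a dict slot -> relative subscript, or None if stmt is unstratified."""
    out = []
    constraints(stmt, {}, count(), out)
    parent, offset = {}, {}                     # offset[s] = type(s) - type(parent[s])

    def find(s):
        parent.setdefault(s, s)
        offset.setdefault(s, 0)
        if parent[s] == s:
            return s, 0
        root, d = find(parent[s])
        parent[s], offset[s] = root, offset[s] + d
        return root, offset[s]

    for lower, higher in out:
        rl, dl = find(lower)
        rh, dh = find(higher)
        if rl == rh:
            if dh - dl != 1:                    # (c) is violated
                return None
        else:
            parent[rh], offset[rh] = rl, dl + 1 - dh
    return {s: find(s)[1] for s in parent}

x, y, z = ("var", "x"), ("var", "y"), ("var", "z")
print(stratify(("in", x, x)))                                   # None
print(stratify(("and", ("in", x, y), ("in", y, x))))            # None
print(stratify(("and", ("in", x, y), ("in", y, z))) is not None)  # True
```

Subscripts are reported only relative to an arbitrary starting slot in each connected group, in keeping with the remark that one may add an integer uniformly to all subscripts. Abbreviations such as A = B must be expanded into the primitive notation before this sketch is applied.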

Although we are not using the theory of types at all, nevertheless
stratification had its origin in the theory of types, and it is convenient to
speak of the subscript attached to a variable as the type of the variable. In
this connection, we also speak of the type of ιx P, by which we mean the
subscript attached to the explicitly indicated occurrence of x in the prefix ιx.

The reader is perhaps troubled by the fact that, no matter what the
terms A and B are, we permit formation of the statement (A ε B), which
would make sense only if B is a class. That is, we have made no provision
for  nonclasses.  To  put  it  otherwise,  we  admit  as  statements  only  sentences 
concerned  exclusively  with  classes.  It  might  be  argued  with  good  reason 
that  this  is  an  unnecessarily  restricted  definition  of  a  statement.  However, 
it  is  sufficiently  broad  for  the  development  of  mathematics,  which  is  our 
sole  aim  at  present,  and  so  we  take  it  as  our  definition.  It  is  certainly 
possible  for  anyone  to  extend  the  definition  of  a  statement  by  allowing  as 
statements  additional  sentences  besides  those  prescribed  by  our  definition 
above.  Such  a  person  would  then  have  to  choose  appropriate  axioms  to  go 
with  his  new  definition  of  a  statement.  However,  we  leave  such  matters 
to  someone  else  and  proceed  with  statements  which  speak  only  of  classes. 

Since  all  our  objects  of  discourse  are  classes,  we  can  take  as  the  criterion 
of  equality  of  two  objects  just  the  criterion  of  equality  of  two  classes. 
Two classes are equal if and only if they have the same members. Some
philosophers would not accept this, but mathematicians generally would,
and we shall. Thus α = β if and only if

(x) (x ε α ≡ x ε β).

Accordingly, we adopt the definition:

If x is a variable and A and B are terms and x does not occur in either A
or B, then

(x) (x ε A ≡ x ε B)

shall be abbreviated to

A = B

or occasionally to

A =_x B

in case we wish to indicate just which x we are using. On occasion we write

A = B = C for (A = B)(B = C),
A = B = C = D for (A = B)(B = C)(C = D),
etc.

Clearly the free occurrences of variables in A = B are just the free
occurrences in A or B, and the bound occurrences are all occurrences of x plus
the bound occurrences in A or B.

If  one  is  trying  to  stratify  a  formula  containing 

A =_x B,

one  can  see  by  referring  to  the  definition  that  both  A  and  B  would  have 
to  have  a  type  one  higher  than  the  type  of  x.    That  is,  in 

A  =  B, 

A  and  B  would  have  to  have  the  same  type.  Conversely,  if  they  can  be 
simultaneously  stratified  in  such  a  way  that  they  have  the  same  type,  then 
A  =  B  can  be  stratified.  This  can  be  put  in  more  exact  form.  Suppose  we 
are  trying  to  determine  if  S  is  stratified,  and  S  contains  a  part  A  =  B.  To 
apply  the  definition  of  stratification  as  stated,  one  would  expect  to  have  to 
replace  A  =  B  by  its  definition.  However,  this  is  not  necessary,  since  it 
suffices to assign the same types to A and B. In effect then, we are avoiding
the necessity of replacing A = B by its definition by adding to the
specifications for stratification another part:

(d) In each occurrence in S of a part of the form x = y, ιx P = y,
x = ιy Q, or ιx P = ιy Q in which x and y are variables and P and Q are
statements, the same subscript shall be attached to the explicitly indicated
occurrences of x and y.

We  now  list  our  rules  and  axioms,  including  the  new  ones  for  class 
membership. 

Modus Ponens: If P and P ⊃ Q, then Q, where P and Q are statements.

The axioms are any of the statements listed below, with as many universal
quantifiers as desired attached on the front (in particular, none need
be  attached,  and  a  given  one  can  be  attached  more  than  once  if  desired). 
It  is  understood  that  P,  Q,  and  R  denote  statements  and  x,  y,  and  z  denote 
variables,  and  the  statements  need  not  be  distinct  and  the  variables  need 
not  be  distinct  unless  explicitly  stated. 


1. P ⊃ PP.

2. PQ ⊃ P.

3. (P ⊃ Q) ⊃ (~(QR) ⊃ ~(RP)).

4. (x).P ⊃ Q: ⊃ :(x) P ⊃ (x) Q.

5. P ⊃ (x) P, in case there are no free occurrences of x in P.

6. (x) P ⊃ {Sub in P: y for x}, in case there is no confusion of bound
variables in the substitution indicated by {Sub in P: y for x}.

7. (x,y,z):x = y. ⊃ .x ε z ⊃ y ε z, in case x, y, and z are distinct.

8. (x) P ⊃ {Sub in P: ιy Q for x}, in case there is no confusion of bound
variables in the substitution indicated by {Sub in P: ιy Q for x}.

9. (x).P ≡ Q: ⊃ :ιx P = ιx Q.

10. ιx P = ιy Q, in case Q is the result of replacing each free occurrence
of x in P by an occurrence of y, and P is the result of replacing each free
occurrence of y in Q by an occurrence of x.

11. (E_1 x) P: ⊃ :(x).ιx P = x ≡ P.

12. (Ey)(x).x ε y ≡ P, in case x and y are distinct, there are no
occurrences of y in P, and P is stratified.

Our convention stated at the end of Chapter VI would assure the
distinctness of x, y, and z in Axiom scheme 7 and the distinctness of x and y
in  Axiom  scheme  12.  However,  in  stating  our  axioms,  it  seemed  safest  to 
be  overly  explicit  in  order  to  emphasize  that  distinctness  is  required  only 
when  it  is  stated. 

We  see  that  we  have  added  only  one  new  axiom  scheme,  namely,  Axiom 
scheme  12.  However,  we  have  replaced  Axiom  schemes  7A  and  7B  by  the 
single Axiom scheme 7. More importantly, we have taken A = B to be

(x).x ε A ≡ x ε B.

This gives us immediately the theorem

(y,z):.(x).x ε y ≡ x ε z: ⊃ :y = z.

If we should wish to keep A = B as an undefined concept, we should have
to  retain  Axiom  schemes  7A  and  7B  and  add  the  theorem  above  as  an 
Axiom  scheme  13. 

It  might  be  thought  that  in  Axiom  scheme  12  we  should  specify  that  x 
should  occur  free  in  P,  so  that  P  will  be  a  condition  on  x.    However,  since 

⊢ P ≡ (x = x)P

for  each  P,  one  can  easily  deduce  our  form  of  Axiom  scheme  12  from  a  form 
in  which  P  is  required  to  have  free  occurrences  of  x,  and  so  we  take  our 
form,  since  it  is  simpler. 

The first order of business is the proof of Axiom schemes 7A and 7B.

**Theorem IX.2.1. ⊢ (x).x = x.

Proof. By the definition of equality, this is just

⊢ (x,y).y ε x ≡ y ε x,

which is an easy consequence of ⊢ P ≡ P.

**Theorem IX.2.2. ⊢ (x,y).x = y ≡ y = x.

Proof. This is just

⊢ (x,y):.(z).z ε x ≡ z ε y: ≡ :(z).z ε y ≡ z ε x.

Theorem IX.2.3. ⊢ (x,y,z):x = y. ⊃ .x ε z ≡ y ε z.

Proof. By Axiom scheme 7,

(1) ⊢ x = y. ⊃ .x ε z ⊃ y ε z.

(2) ⊢ y = x. ⊃ .y ε z ⊃ x ε z.

By (2) and Thm.IX.2.2,

(3) ⊢ x = y. ⊃ .y ε z ⊃ x ε z.

Our theorem follows from (1) and (3).
Axiom  scheme  7A  says  that 

(x,y):x = y. ⊃ .F(x,y,x) ⊃ F(x,y,y)

with  certain  restrictions  on  bound  variables.  We  now  prove  a  slightly 
stronger  statement. 

**Theorem IX.2.4. If x, y, and z are distinct variables, and F(x,y,z) is a
statement, and F(x,y,x) and F(x,y,y) are the results of replacing each free
occurrence of z in F(x,y,z) by occurrences of x and y, respectively, and if
this replacement causes no confusion of bound variables, then

⊢ (x,y):x = y. ⊃ .F(x,y,x) ≡ F(x,y,y).

Proof.     Proof  by  induction  on  the  number  of  symbols  in  F(x,y,z). 

If  F(x,y,z)  contains  only  one  symbol,  it  is  not  a  statement,  and  the 
theorem  holds.  Assume  the  theorem  for  all  F(x,y,z)  with  n  or  fewer 
symbols, and let F(x,y,z) have n + 1 symbols. By the definition of a
statement,  there  are  four  main  cases. 

Case 1. F(x,y,z) is A ε B.

Lemma. ⊢ x = y. ⊃ .{Sub in A: x for z} = {Sub in A: y for z}.

(i) Let A contain no free z's. Then the lemma takes the form ⊢ x = y ⊃
A = A, which is easily proved.

(ii) Let A be z. Then the lemma takes the form ⊢ x = y ⊃ x = y,
which is easily proved.

(iii) Let A contain free occurrences of z, and be ιv G(x,y,z). Then v is
distinct from z, since A contains free occurrences of z. Also v is distinct
from both x and y, since otherwise there would be confusion of bound
variables in forming F(x,y,x) or F(x,y,y). By the hypothesis of the
induction

⊢ x = y. ⊃ .G(x,y,x) ≡ G(x,y,y).

By rule G and Thm.VI.6.6, Part IX,

⊢ x = y. ⊃ :(v).G(x,y,x) ≡ G(x,y,y).

By Axiom scheme 9,

⊢ x = y ⊃ ιv G(x,y,x) = ιv G(x,y,y),

which is the lemma, since v is distinct from z.

In the same manner exactly, we prove

(1) ⊢ x = y. ⊃ .{Sub in B: x for z} = {Sub in B: y for z}.

By Thm.IX.2.3 and our lemma,

(2) ⊢ x = y. ⊃ .{Sub in A: x for z} ε {Sub in B: x for z}
≡ {Sub in A: y for z} ε {Sub in B: x for z}.

By the definition of equality, (1) is

⊢ x = y: ⊃ :(w).w ε {Sub in B: x for z}
≡ w ε {Sub in B: y for z}.

Hence by Axiom scheme 6 or 8

⊢ x = y. ⊃ .{Sub in A: y for z} ε {Sub in B: x for z}
≡ {Sub in A: y for z} ε {Sub in B: y for z}.

From this and (2), we get

⊢ x = y. ⊃ .{Sub in A: x for z} ε {Sub in B: x for z}
≡ {Sub in A: y for z} ε {Sub in B: y for z}.

This is just

⊢ x = y. ⊃ .F(x,y,x) ≡ F(x,y,y).

Case 2. F(x,y,z) is (v) G(x,y,z).

Subcase 1. Let (v) G(x,y,z) contain no free z's. Then our theorem takes
the form ⊢ x = y. ⊃ .(v) G(x,y,z) ≡ (v) G(x,y,z).

Subcase 2. Let (v) G(x,y,z) contain free z's. Then v is distinct from z.
Also v must be distinct from both x and y, else there would be confusion of
bound variables in forming F(x,y,x) or F(x,y,y).

By the hypothesis of the induction,

⊢ x = y. ⊃ .G(x,y,x) ≡ G(x,y,y).

By rule G and Thm.VI.6.6, Part IX,

⊢ x = y: ⊃ :(v).G(x,y,x) ≡ G(x,y,y).

By Thm.VI.5.3,

⊢ x = y. ⊃ .F(x,y,x) ≡ F(x,y,y).

Case 3. F(x,y,z) is ~G(x,y,z).

By the hypothesis of the induction,

⊢ x = y. ⊃ .G(x,y,x) ≡ G(x,y,y).

By ⊢ P ≡ Q. ⊃ .~P ≡ ~Q, we get

⊢ x = y. ⊃ .F(x,y,x) ≡ F(x,y,y).

Case 4. F(x,y,z) is G(x,y,z)&H(x,y,z). Proof analogous to that of case 3.

**Theorem IX.2.5. If x, y, and z are distinct variables, and A(x,y,z) is a
term, and A(x,y,x) and A(x,y,y) are the results of replacing each free
occurrence of z in A(x,y,z) by occurrences of x and y, respectively, and if this
replacement causes no confusion of bound variables, then ⊢ (x,y):x = y. ⊃ .
A(x,y,x) = A(x,y,y).

Proof. Taking F(x,y,z) to be A(x,y,x) = A(x,y,z) in Thm.IX.2.4 gives

⊢ x = y: ⊃ :A(x,y,x) = A(x,y,x). ≡ .A(x,y,x) = A(x,y,y).

However, by Thm.IX.2.1 and Axiom scheme 6 or 8,

⊢ A(x,y,x) = A(x,y,x).

Note.  Thm.IX.2.4  allows  us  to  substitute  a  quantity  for  its  equal  (more 
precisely,  to  substitute  one  name  of  a  quantity  for  another  name  of  the 
same  quantity)  in  a  statement.  Thm.IX.2.5  allows  us  to  make  a  similar 
substitution  in  a  formula.  Thus,  from  Thm.IX.2.4,  we  could  get  such 
results  as 

x = y, x < z ⊢ y < z,

P = Q, P in circle C ⊢ Q in circle C.

On the other hand, from Thm.IX.2.5, we get such results as

x = y ⊢ x^2 = y^2,

x = y ⊢ e^x = e^y.

We can easily get generalizations such as

x = y, u = v ⊢ x + u = y + v

by using Thm.IX.2.5 twice to get

x = y ⊢ x + u = y + u,

u = v ⊢ y + u = y + v.



When  we  proved  the  substitution  theorem,  we  said  that  we  were  proving 
only  a  special  case  and  would  later  prove  the  theorem  in  full  generality. 
The  time  has  come  to  do  so. 

As  before,  we  have  a  complicated  hypothesis  if  we  wish  to  state  the 
theorem  with  precision. 

Hypothesis H2. P_1, P_2, ..., P_n, Q, R are statements, A_1, A_2, ..., A_m
are terms, x_1, x_2, ..., x_a are variables, and y is a variable distinct from all
the x's, and not occurring in any of P_1, P_2, ..., P_n, Q, R, A_1, A_2, ..., A_m.
U is a statement built up out of some or all of y ε y, P_1, P_2, ..., P_n, Q, R,
A_1, A_2, ..., A_m, x_1, x_2, ..., x_a by means of (, ), ~, &, ι, and ε, where each
part listed may be used as often as desired. W and V are the results of
replacing all occurrences of y ε y in U by Q and R, respectively.

Theorem IX.2.6. Assume Hypothesis H2. Let y_1, y_2, ..., y_b be variables
such that there are no free occurrences of any of the x's in (y_1, y_2, ...,
y_b).Q ≡ R. Then ⊢ (y_1, y_2, ..., y_b).Q ≡ R: ⊃ :W ≡ V.

Proof. Proof by induction on the number of symbols in U. Temporarily
let X denote (y_1, y_2, ..., y_b).Q ≡ R.

If  U  has  only  one  symbol,  it  is  not  a  statement,  and  the  theorem  is  true. 

Assume  the  theorem  if  U  has  n  or  fewer  symbols,  and  let  U  have  n  +  1 
symbols.    The  first  five  cases,  namely: 

1. U is y ε y.

2. U does not contain y ε y.

3. U is ~Y.

4. U is Y&Z.

5. U is (x) Y.

are handled just like the cases in the proof of Thm.VI.5.4. There remains:

Case 6. U is B ε C. Then W and V are D ε E and F ε G, where D and E
are the results of replacing all occurrences of y ε y in B and C by Q, and
F and G are the results of replacing all occurrences of y ε y in B and C by R.

Lemma 1. ⊢ X. ⊃ .D = F.

In case there are no occurrences of y ε y in B, this follows from
Thm.IX.2.1. So let B contain some occurrences of y ε y. Then B must
have the form ιx Y_1, and D and F must be ιx Y_2 and ιx Y_3, where Y_2 and Y_3
are the results of replacing all occurrences of y ε y in Y_1 by Q and R,
respectively. By the hypothesis of the induction, ⊢ X. ⊃ .Y_2 ≡ Y_3. But there
are no free occurrences of x in X, so that by rule G and Thm.VI.6.6, Part
IX, ⊢ X. ⊃ .(x).Y_2 ≡ Y_3. So by Axiom scheme 9, ⊢ X. ⊃ .ιx Y_2 = ιx Y_3.
That is, ⊢ X. ⊃ .D = F.

Lemma 2. ⊢ X. ⊃ .E = G.

Proof. Proof similar to that of Lemma 1.

By Lemma 1 and Thm.IX.2.3,

(1) ⊢ X: ⊃ :D ε E ≡ F ε E.

By Lemma 2 and the definition of equality ⊢ X: ⊃ :(w).w ε E ≡ w ε G.
So ⊢ X: ⊃ :F ε E ≡ F ε G. Hence by (1), ⊢ X: ⊃ :D ε E ≡ F ε G. That
is, ⊢ X: ⊃ :W ≡ V.

Theorem IX.2.7. Assume Hypothesis H2. If ⊢ Q ≡ R, then ⊢ W ≡ V.

Proof. Proof like that of Thm.VI.5.5.

Theorem IX.2.8. Assume Hypothesis H2. If ⊢ Q ≡ R and ⊢ W, then ⊢ V.

In  this  connection,  there  are  two  other  useful  results.  We  first  define 
Hypothesis  H3. 

Hypothesis  H3.  Same  as  Hypothesis  H2  except  that  U,  V,  and  W  are 
terms. 

Theorem IX.2.9. Assume Hypothesis H3. Let y_1, y_2, ..., y_b be variables
such that there are no free occurrences of any x's in (y_1, y_2, ..., y_b).
Q ≡ R. Then ⊢ (y_1, y_2, ..., y_b).Q ≡ R: ⊃ :W = V.

Proof. Choose w a variable distinct from each of the x's and distinct
from y and not occurring in any of P_1, P_2, ..., P_n, Q, R, A_1, A_2, ..., A_m.
Then by Thm.IX.2.6, ⊢ (y_1, y_2, ..., y_b).Q ≡ R: ⊃ :w ε W ≡ w ε V. So by
rule G and Thm.VI.6.6, Part IX,

⊢ (y_1, y_2, ..., y_b).Q ≡ R: ⊃ :(w).w ε W ≡ w ε V.

This is

⊢ (y_1, y_2, ..., y_b).Q ≡ R: ⊃ :W = V.

Theorem IX.2.10. Assume Hypothesis H3. If ⊢ Q ≡ R, then ⊢ W = V.

By use of the substitution theorem and either Thm.VI.6.8 or Axiom
scheme 10, one can in any statement S replace any x-bound part (x) F(x) or
ιx F(x) by (y) F(y) or ιy F(y), where y is a new variable not previously
appearing in S. If one repeats this for all x-bound parts of S for the various
x's for which there are x-bound parts of S, one will eventually arrive at an
equivalent form of S in which no bound variable is the same as a free
variable, and no x-bound part will contain the same bound variables as any
other nonoverlapping part which is bound by a variable. Likewise, given
any term A, we can find another term B such that ⊢ A = B and in B no
bound variable is the same as any free variable, and no x-bound part will
contain the same bound variables as any other nonoverlapping part which
is bound by a variable.

These  possibilities  greatly  widen  the  applicability  of  theorems  such  as 
Thm.IX.3.5  below,  since  preparatory  to  the  use  of  such  a  theorem  we  can 
change  all  bound  variables  to  something  else  and  afterward  change  back 
again,  subject  always  to  avoiding  confusion  of  bound  variables. 
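
The change of bound variables described above is purely mechanical, and a short sketch may make this plain. It assumes an illustrative tuple encoding of terms and statements and assumes further that the generated names v0, v1, ... do not already occur in S; both the encoding and the name rename_bound are our own choices, and the sketch shows only the syntactic rewriting, not the proof that the result is equivalent to S.

```python
from itertools import count

# Illustrative encoding (an assumption of this sketch):
#   terms:      ("var", "x")            ("iota", "x", P)
#   statements: ("in", A, B)   ("all", "x", P)   ("not", P)   ("and", P, Q)

def rename_bound(node, env=None, fresh=None):
    """Give every prefix (x) or ιx its own new variable v0, v1, ...

    Assumes that no name of the form v0, v1, ... occurs in the input.
    """
    env = {} if env is None else env            # old bound name -> new name
    fresh = count() if fresh is None else fresh
    tag = node[0]
    if tag == "var":
        return ("var", env.get(node[1], node[1]))   # free occurrences are kept
    if tag in ("all", "iota"):
        new = f"v{next(fresh)}"
        return (tag, new, rename_bound(node[2], {**env, node[1]: new}, fresh))
    return (tag,) + tuple(rename_bound(part, env, fresh) for part in node[1:])

x, y = ("var", "x"), ("var", "y")
S = ("and", ("in", x, y), ("all", "x", ("in", x, y)))   # ((x ε y)&(x)(x ε y))
print(rename_bound(S))
# ('and', ('in', ('var', 'x'), ('var', 'y')),
#         ('all', 'v0', ('in', ('var', 'v0'), ('var', 'y'))))
```

After the rewriting, no bound variable coincides with a free variable and no two prefixes use the same variable, which is the form required before applying theorems that are sensitive to confusion of bound variables.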
EXERCISES

IX.2.1. Test for stratification:

(a) ια ((x) x ε α) ε ια ((x) x ε α).

(b) ια ((x).x ε α ≡ ~x ε β) ε ια ((x).x ε α ≡ ~x ε β).

(c) y ε ια (x):x ε α. ≡ .x = y.

(d) x ε α. ≡ .x ε β&x ε γ.

(e)  X  e  0.  =  .X  e  i,a  (y).y  e  a  ^  ^  €  y. 

IX.2.2. Let α and β be restricted variables, both subject to the
restriction K(α). State what conditions must be satisfied if (β).β = ια (α = β)
is to be stratified.

3.  Formalism  for  Classes.  The  notion  of  the  class  (if  any)  determined 
by a condition F(x) is basic in all of mathematics. As we indicated above,
if α is such a class, then

(x).x ε α ≡ F(x).

Clearly such a class is unique, for if also

(x).x ε β ≡ F(x),

then we infer

(x).x ε α ≡ x ε β,

which is α = β.

Under the circumstances, a suitable notation for the class (if any)
determined by a condition F(x) is

ια (x).x ε α ≡ F(x).

We shall denote the class determined by F(x) by x̂F(x). Specifically,
we set forth the following definition.

If x is a variable and P is a statement, then x̂P shall denote

ια (x).x ε α ≡ P,

where α is a variable which is distinct from x and does not occur at all in P.

x̂P is to be considered as the class of all x's which make P true. That is,
x̂P is the class determined by P. Clearly x̂P can function as this class only
in case there is such a class. If there is no class determined by P, then we
may consider x̂P as meaningless. Actually, if P determines no class, then
we have ~(E_1 α)(x).x ε α ≡ P, and x̂P would have whatever meaning we
have agreed to give to ια (x).x ε α ≡ P under such a circumstance. One
possibility is that no meaning at all is assigned, but this does not prevent
us from making formal use of x̂P.

Note that P need not contain any occurrences of x in order to give a
meaning to x̂P. If P contains no free occurrences of x, then x̂P is the class
of all objects in case P is true and the class of no objects in case P is false.

x̂P is commonly read "x hat P" or "x roof P." In a rough way, it
symbolizes the idea of collecting under one roof all x's which make P true.

All occurrences of x and α in x̂P are bound. Occurrences of other
variables are free or bound in x̂P according as they are free or bound in the
part P.

x̂P is stratified if and only if P is. If x̂P is stratified and P contains free
occurrences of x, the type of x̂P must be exactly one higher than the type
of the free occurrences of x in P. This accords with the basic tenet of the
theory of types, according to which a class should be of type exactly one
higher than its members. If there are no free occurrences of x in P, then
in stratifying x̂P the type of x̂P is quite unrelated to the type of any
constituent of P.

As we indicated earlier, not every condition determines a class. More
generally, given an arbitrary x and P, there may or may not exist the class
which we would like to denote by $\hat{x}P$. Note that we are entitled to use $\hat{x}P$
in our symbolic logic even if there is no class which it denotes, since we do
not require that our formulas have meaning. In order for there to be a
class denoted by $\hat{x}P$, it is necessary and sufficient that

$$(E\alpha)(x).x \in \alpha \equiv P$$

where $\alpha$ does not occur in P (our conventions ensure that x and $\alpha$ are
distinct variables). We shall abbreviate this to $\exists(\hat{x}P)$ or $\exists\hat{x}P$, which we
read as "$\hat{x}P$ exists."

In mathematics, one often finds $\mathop{E}\limits_{x} P$ or $\{x \mid P\}$ written to denote $\hat{x}P$.

Another widely used notion is a generalization of the class of all x's
determined by a condition F(x). In the generalization, we have some
function f(x) and we ask for the class of all objects f(x) got by using x's
which satisfy the condition F(x). If $\alpha$ is this class, then

$$(x):x \in \alpha. \equiv .(Ey).x = f(y).F(y).$$

So the class in question would be

$$\hat{x}(Ey).x = f(y).F(y).$$

A typical example might be the class of all squares of primes. Here f(x)
is $x^2$ and F(x) is "x is a prime."

One can generalize still further and have a function f(x,y) of two variables
and a condition F(x,y) on two variables. Then one can ask for the
class of all objects f(x,y) got by using pairs of x and y which satisfy the
condition F(x,y). If $\alpha$ is this class, then

$$(x):x \in \alpha. \equiv .(Ey,z).x = f(y,z).F(y,z).$$

So the class in question is

$$\hat{x}(Ey,z).x = f(y,z).F(y,z).$$

A typical example might be the class of all sums of two odd primes. Here
f(x,y) is x + y and F(x,y) is "x and y are each odd primes." The famous
Goldbach conjecture is that the class in question consists of all even numbers
greater than four.
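The two-variable construction can be tried out on a finite range of integers. The following is only a minimal sketch: the names `is_odd_prime`, `image_class`, and the bound `LIMIT` are illustrative assumptions, and a finite search of course says nothing about the conjecture itself.

```python
def is_odd_prime(n):
    """Condition F on one number: n is an odd prime."""
    if n < 3 or n % 2 == 0:
        return False
    d = 3
    while d * d <= n:
        if n % d == 0:
            return False
        d += 2
    return True


def image_class(f, F, domain):
    """Finite stand-in for the class of all f(y, z) with F(y, z) true."""
    return {f(y, z) for y in domain for z in domain if F(y, z)}


LIMIT = 100  # hypothetical bound, chosen only to keep the search finite

sums = image_class(
    lambda y, z: y + z,
    lambda y, z: is_odd_prime(y) and is_odd_prime(z),
    range(LIMIT),
)
evens = set(range(6, LIMIT, 2))  # even numbers greater than four, below LIMIT

# Below LIMIT, the class of sums coincides with the even numbers > 4.
print({s for s in sums if s < LIMIT} == evens)
```

Within this bound the two sets agree, which is what the Goldbach conjecture asserts for all even numbers greater than four.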

In general, we might have a function $f(y_1, \ldots, y_n)$ and a condition
$F(y_1, \ldots, y_n)$, and we might ask for the class of all objects $f(y_1, \ldots, y_n)$
got by using n-tuples $y_1, \ldots, y_n$ which satisfy the condition $F(y_1, \ldots, y_n)$.
In symbolic logic, the role of the function $f(y_1, \ldots, y_n)$ would be played by
a term A with free occurrences of $y_1, \ldots, y_n$ and the role of the condition
$F(y_1, \ldots, y_n)$ would be played by a statement P with free occurrences of
$y_1, \ldots, y_n$. Then if $\alpha$ is the class of all objects A got by using n-tuples
$y_1, \ldots, y_n$ which satisfy the condition P, we would have

$$(x):x \in \alpha. \equiv .(Ey_1, \ldots, y_n).x = A.P.$$

Then $\hat{x}(Ey_1, \ldots, y_n).x = A.P$ is the class in question.

Mathematicians have a convenient notation for this class, to wit

$$\{A \mid P\}.$$

We shall adopt this notation, but we have to add certain additional
qualifications to give the notation a unique significance. This comes about as
follows. If we write

$$\{x + y \mid F(x,y)\},$$

this means the class of all sums of pairs of numbers x and y satisfying the
condition F(x,y). However, if we write

$$\{x + y \mid F(x)\}$$

we then mean the class of all sums formed by adding to the fixed number y
each x which satisfies the condition F(x).

The point is that one can consider x + y as a function of two variables
x and y, or as a function of x dependent on the parameter y. If we write

$$\{x + y \mid P\},$$

it is quite necessary to know which interpretation of x + y is intended,
since in the one case we get a unique class, and in the other case we get a
class dependent on a parameter y.

There seems to be no established mathematical convention to cover this
point. However, common usage seems to accord generally with the following
convention (which we hereby adopt):

Let $y_1, y_2, \ldots, y_n$ be distinct variables constituting exactly all variables
which have free occurrences in both of A and P. Then A is to be considered
as a function of $y_1, y_2, \ldots, y_n$ and P is to be considered as a condition on
$y_1, y_2, \ldots, y_n$. Any other variables which may have free occurrences in
either of A or P are to be construed as parameters upon which the class
$\{A \mid P\}$ depends. That is, we consider $\{A \mid P\}$ as an abbreviation for

$$\hat{x}(Ey_1, y_2, \ldots, y_n).x = A.P,$$

where $y_1, y_2, \ldots, y_n$ are as stated above and x is a variable not occurring
in either of A or P (this ensures that x is not one of $y_1, y_2, \ldots, y_n$).
Whenever we use the notation $\{A \mid P\}$, the y's are understood to be as stated
above.
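The convention amounts to a simple computation on sets of free variables. As a minimal sketch (the function name `split_variables` and the representation of A and P by their sets of free variables are assumptions made only for illustration):

```python
def split_variables(free_in_A, free_in_P):
    """Return (bound, parameters) for {A | P} under the adopted convention.

    bound      -- variables free in BOTH A and P (the y's, bound by the notation)
    parameters -- variables free in exactly one of A and P
    """
    a, p = set(free_in_A), set(free_in_P)
    return sorted(a & p), sorted(a ^ p)


# {x + y | F(x, y)}: both x and y are bound, no parameters remain.
print(split_variables({"x", "y"}, {"x", "y"}))

# {x + y | F(x)}: only x is bound; y is a parameter of the class.
print(split_variables({"x", "y"}, {"x"}))
```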

Clearly $\{A \mid P\}$ is stratified if and only if $(Ey_1, y_2, \ldots, y_n).x = A.P$ is
stratified, and if stratified, $\{A \mid P\}$ must have type one higher than A.
Hence $\{A \mid P\}$ is stratified if and only if A = A.P is stratified.

In $\{A \mid P\}$, all the y's are bound as also are the x and $\alpha$ implicit in the
definition (not to mention the bound variable implicit in the = of x = A).
Other variables are free or bound in $\{A \mid P\}$ according as they are free or
bound in A and P. There is nothing in the notation $\{A \mid P\}$ to remind the
reader that all the y's are bound. It will simply be necessary to cultivate
the habit of considering the $\{\ \mid\ \}$ notation as binding those variables which
appear to occur free on each side of the vertical bar.

If P contains no free variables except x, then $\hat{x}P$ has no free variables;
in case it is stratified, it can be assigned an arbitrary type, and two
occurrences of it in the same formula may be assigned different types. If
$y_1, y_2, \ldots, y_n$ are all the variables which occur free in either of A or P, then
$\{A \mid P\}$ contains no free variables; in case it is stratified, it can be assigned
an arbitrary type, and two occurrences of it in the same formula may be
assigned different types.
**Theorem IX.3.1. If P is stratified, then $\vdash \exists\hat{x}P$.

Proof. Use Axiom scheme 12.
**Corollary. If A = A.P is stratified, then $\vdash \exists\{A \mid P\}$.

Theorem IX.3.2. If $\beta$ has no free occurrences in P, then
$\vdash \exists\hat{x}P.: \supset :.(\beta):\beta = \hat{x}P. \equiv .(x).x \in \beta \equiv P.$

Proof. Temporarily, let $F(\alpha)$ denote $(x).x \in \alpha \equiv P$. Then we easily
show

$$\vdash F(\alpha)F(\beta). \supset .(x).x \in \alpha \equiv x \in \beta.$$

By the definition of equality, this is

$$\vdash F(\alpha)F(\beta) \supset \alpha = \beta.$$

Noting that $\exists\hat{x}P$ is just $(E\alpha)\,F(\alpha)$, we see that if we assume $\exists\hat{x}P$ we get

$$(E\alpha)\,F(\alpha):(\alpha,\beta).F(\alpha)F(\beta) \supset \alpha = \beta.$$

So by Thm.VII.2.1, $(E_1\alpha)\,F(\alpha)$. So by Axiom scheme 11,

$$(\alpha):\iota\alpha\,F(\alpha) = \alpha. \equiv .F(\alpha).$$

Hence

$$(\beta):\iota\alpha\,F(\alpha) = \beta. \equiv .F(\beta).$$

That is,

$$(\beta):\hat{x}P = \beta. \equiv .F(\beta),$$

which is the desired conclusion.
**Corollary 1. $\vdash \exists\hat{x}P.: \supset :(x).x \in \hat{x}P \equiv P.$

Corollary 2. If x and $\beta$ have no free occurrences in either of A or P, then
$\vdash \exists\{A \mid P\}:: \supset ::(\beta):.\beta = \{A \mid P\}: \equiv :(x):x \in \beta. \equiv .(Ey_1, y_2, \ldots, y_n).x = A.P.$

**Corollary 3. If x has no free occurrences in either of A or P, then
$\vdash \exists\{A \mid P\}.: \supset :.(x):x \in \{A \mid P\}. \equiv .(Ey_1, y_2, \ldots, y_n).x = A.P.$

Corollary 4. $\vdash \exists\{A \mid P\}: \supset :P. \supset .A \in \{A \mid P\}.$
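**Theorem IX.3.3. $\vdash (x).P \equiv Q: \supset :\hat{x}P = \hat{x}Q.$

Proof. Choose $\alpha$ a variable which does not occur in either P or Q. Then
by Thm.VI.5.4,

$$\vdash (x).P \equiv Q.: \supset :.(x).x \in \alpha \equiv P: \equiv :(x).x \in \alpha \equiv Q.$$

So by rule G and Thm.VI.6.6, Part IX,

$$\vdash (x).P \equiv Q:: \supset ::(\alpha):.(x).x \in \alpha \equiv P: \equiv :(x).x \in \alpha \equiv Q.$$

Then by Axiom scheme 9,

$$\vdash (x).P \equiv Q.: \supset :.\iota\alpha\,(x).x \in \alpha \equiv P: = :\iota\alpha\,(x).x \in \alpha \equiv Q.$$

Corollary 1. $\vdash (y_1, y_2, \ldots, y_n).P \equiv Q: \supset :\{A \mid P\} = \{A \mid Q\}.$

Corollary 2. If there are free occurrences of x in P, then $\vdash \hat{x}P = \{x \mid P\}.$

Corollary 3. $\vdash \hat{x}P = \{x \mid x = x.P\}.$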
**Theorem IX.3.4. $\vdash \hat{x}P = \hat{y}Q$ in case Q is the result of replacing each free
occurrence of x in P by an occurrence of y, and P is the result of replacing
each free occurrence of y in Q by an occurrence of x.

Proof. We note that the theorem to be proved can be written
$\vdash \hat{x}F(x) = \hat{y}F(y)$. Choose a variable $\alpha$ which does not occur in either F(x)
or F(y). Then by Thm.VI.6.8, Part I,

$$\vdash (x).x \in \alpha \equiv F(x): \equiv :(y).y \in \alpha \equiv F(y).$$

So

$$\vdash (\alpha):.(x).x \in \alpha \equiv F(x): \equiv :(y).y \in \alpha \equiv F(y).$$

By Axiom scheme 9, $\vdash \hat{x}F(x) = \hat{y}F(y)$.

Theorem IX.3.5. Let $y_1, y_2, \ldots, y_n$ be distinct variables which constitute
exactly all the variables which have free occurrences in each of A and
P. Let $u_1, u_2, \ldots, u_n$ be distinct variables which do not occur in either of
A or P. Let B denote $\{\text{Sub in } A: u_1 \text{ for } y_1, u_2 \text{ for } y_2, \ldots, u_n \text{ for } y_n\}$ and Q
denote $\{\text{Sub in } P: u_1 \text{ for } y_1, u_2 \text{ for } y_2, \ldots, u_n \text{ for } y_n\}$. Then
$\vdash \{A \mid P\} = \{B \mid Q\}.$

Proof like that of Thm.IX.3.4.

Clearly, our theorem will give such useful results as

$$\vdash \{x + y \mid F(x,y)\} = \{u + v \mid F(u,v)\}.$$

It can be made to give

$$\vdash \{x + y \mid F(x,y)\} = \{x + z \mid F(x,z)\}$$

by using it a second time to give

$$\vdash \{u + v \mid F(u,v)\} = \{x + z \mid F(x,z)\}.$$

Similarly, by using the theorem twice, we can get

$$\vdash \{x + y \mid F(x,y)\} = \{y + x \mid F(y,x)\}.$$

The extension of these to other functions than x + y and to several
variables is obvious.

Theorem IX.3.6. $\vdash (\beta,x):x \in \hat{x}(x \in \beta). \equiv .x \in \beta.$

Proof. $\vdash (x).x \in \beta \equiv x \in \beta$. Hence by Thm.VI.7.3,
$\vdash (E\alpha)(x).x \in \alpha \equiv x \in \beta$. That is, $\vdash \exists\hat{x}(x \in \beta)$. Then the theorem
follows by Thm.IX.3.2, Cor. 1.

Corollary 1. $\vdash (\beta).\hat{x}(x \in \beta) = \beta.$

Corollary 2. $\vdash (\alpha):.(x).x \in \alpha \equiv P: \supset :\alpha = \hat{x}P.$

Proof. By Thm.IX.3.3, $\vdash (x).x \in \alpha \equiv P: \supset :\hat{x}(x \in \alpha) = \hat{x}P.$
**Theorem IX.3.7. Suppose that y is the only variable which occurs free
in both A and P and u and v are two variables not occurring in A or P.
Then
$\vdash (u,v):\{\text{Sub in } A: u \text{ for } y\} = \{\text{Sub in } A: v \text{ for } y\}. \supset .u = v:: \supset ::\exists\{A \mid P\}.: \supset :.(y):A \in \{A \mid P\}. \equiv .P.$

Proof. Denote A by B(y), $\{\text{Sub in } A: u \text{ for } y\}$ by B(u), etc. Similarly
denote P by F(y). Assume

(1) $(u,v):B(u) = B(v). \supset .u = v$

and

(2) $\exists\{A \mid P\}.$

Then by (1) and Thm.IX.2.5,

(3) $(u,v):B(u) = B(v). \equiv .u = v.$

Now by (2) and Thm.IX.3.2, Cor. 3,

$$(x):x \in \{A \mid P\}. \equiv .(Ev).x = B(v).F(v).$$

So

$$B(u) \in \{A \mid P\}. \equiv .(Ev).B(u) = B(v).F(v).$$

Then by (3) and Thm.VI.5.4,

$$B(u) \in \{A \mid P\}. \equiv .(Ev).u = v.F(v).$$

So by Thm.VII.1.5, Part I,

$$B(u) \in \{A \mid P\}. \equiv .F(u).$$

So

$$(u):B(u) \in \{A \mid P\}. \equiv .F(u),$$
$$(y):B(y) \in \{A \mid P\}. \equiv .F(y).$$

This latter is

$$(y):A \in \{A \mid P\}. \equiv .P.$$
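The role of hypothesis (1), that B is one-to-one, can be seen on a small finite model. This is a hedged sketch only: the domain, the condition `F`, and the helper names are assumptions made for illustration, whereas the theorem itself quantifies over all objects.

```python
def image_class(B, F, domain):
    """Finite stand-in for {B(y) | F(y)}: all values B(v) with F(v) true."""
    return {B(v) for v in domain if F(v)}


domain = range(-5, 6)      # hypothetical finite domain


def F(y):
    """Illustrative condition: y is positive."""
    return y > 0


def membership_matches_condition(B):
    """Check (y): B(y) in {B(y) | F(y)} if and only if F(y), over the domain."""
    cls = image_class(B, F, domain)
    return all((B(y) in cls) == F(y) for y in domain)


# Doubling is one-to-one, so membership of B(y) tracks F(y) exactly.
print(membership_matches_condition(lambda y: 2 * y))

# Squaring is not one-to-one on this domain: (-2)**2 lies in the class
# although F(-2) is false, so the equivalence fails.
print(membership_matches_condition(lambda y: y * y))
```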


Theorem IX.3.8. Suppose that $y_1$ and $y_2$ are the only variables which
occur free in both A and P and that $u_1, u_2, v_1, v_2$ are variables not occurring
in A or P. If we denote $\{\text{Sub in } A: u_1 \text{ for } y_1, u_2 \text{ for } y_2\}$ by $B(u_1,u_2)$ and
similarly for $B(v_1,v_2)$, then
$\vdash (u_1,u_2,v_1,v_2):B(u_1,u_2) = B(v_1,v_2). \supset .u_1 = v_1.u_2 = v_2:: \supset ::\exists\{A \mid P\}.: \supset :.(y_1,y_2):A \in \{A \mid P\}. \equiv .P.$

Proof similar to that of Thm.IX.3.7.

EXERCISES 
IX.3.1. Prove

$$\vdash \exists\hat{x}P.\exists\hat{x}Q: \supset :\hat{x}P = \hat{x}Q. \supset .(x).P \equiv Q.$$

IX.3.2. Give an example from mathematics for which $\exists\{A \mid P\}$ is true,
but $A \in \{A \mid P\} \equiv P$ is false.

IX.3.3. Prove that if $y_1, y_2, \ldots, y_n$ are distinct variables which constitute
all the variables which have free occurrences in both A and P and if
the y's are all the variables which occur free in A, then

(a) $\vdash \exists\{A \mid P\} \supset \exists\{A \mid A \in \{A \mid P\}\}.$

(b) $\vdash \exists\{A \mid P\}. \supset .\{A \mid A \in \{A \mid P\}\} = \{A \mid P\}.$

IX.3.4. Prove $\vdash \hat{x}(x \in \hat{x}P) = \hat{x}P.$

IX.3.5. Prove that $\hat{x}P$ is stratified if and only if P is.

IX.3.6. Prove Thm.IX.3.3, Cor. 2.

IX.3.7. Prove $\vdash \sim\exists\hat{x}(\sim x \in x).$

IX.3.8. Prove $\vdash (E_1\alpha).\alpha = \hat{x}(\sim(x \in x))$. Explain why we cannot derive
the Russell paradox from this theorem. Give a statement from which the
Russell paradox can be derived and which, from a careless, intuitive point
of view, might seem to say about the same thing as the statement given
above.

IX.3.9. Prove:

(a) $\vdash (x)\,\exists\hat{z}(x = z).$

(b) $\vdash (x,y):x = y. \equiv .y \in \hat{z}(x = z).$

(c) $\vdash (x,y):x = y. \equiv .\hat{z}(z = x) = \hat{z}(z = y).$

IX.3.10. Prove:

(a) $\vdash (\beta)\,\exists\hat{z}((\alpha):\alpha \in \beta. \supset .z \in \alpha).$

(b) $\vdash (\beta,x):.x \in \hat{z}((\alpha):\alpha \in \beta. \supset .z \in \alpha): \equiv :(\alpha):\alpha \in \beta. \supset .x \in \alpha.$

(c) $\vdash (x):x = \hat{z}((\alpha):\alpha \in \hat{z}(z = x). \supset .z \in \alpha).$


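4. The Calculus of Classes. We define

    $\mathrm{V}$              for    $\hat{x}(x = x)$,
    $\Lambda$                 for    $\hat{x}(x \neq x)$,
    $A \cap B$                for    $\hat{x}(x \in A \,\&\, x \in B)$,
    $A \cup B$                for    $\hat{x}(x \in A \vee x \in B)$,
    $\bar{A}$                 for    $\hat{x}(\sim x \in A)$,
    $A - B$                   for    $A \cap \bar{B}$,

where in the definitions we take x as a variable not occurring in A or B.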

We read $\mathrm{V}$ as "Vee", $\Lambda$ as "lambda", $A \cap B$ as "A cap B", $A \cup B$ as
"A cup B", $\bar{A}$ as "A bar", and $A - B$ as "A minus B". We refer to $\mathrm{V}$ as
the universal class, $\Lambda$ as the null class, $A \cap B$ as the product (or logical
product, or common part, or cross cut, or meet, or intersection, or product
class) of A and B, $A \cup B$ as the sum (or logical sum, or union, or join, or
sum class) of A and B, and $\bar{A}$ as the complement (or negate, or
complementary class) of A. All objects are members of $\mathrm{V}$, no objects are members
of $\Lambda$, the members of $A \cap B$ are just those objects which are in both of A
and B, the members of $A \cup B$ are just those objects which are in at least
one of A and B, and the members of $\bar{A}$ are just those objects which are not
in A. Rather commonly one writes $AB$, $A + B$, and $-A$ for $A \cap B$, $A \cup B$,
and $\bar{A}$. We favor this notation ourselves but have not felt free to use it
because we shall have to use class sums and products in the same context
with cardinal and ordinal sums and products, and confusion would arise if
we were not using the cap and cup notation. Other notations which are
occasionally encountered are 1 for $\mathrm{V}$, 0 for $\Lambda$, and $C(A)$ for $\bar{A}$.
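For readers who like to experiment, the definitions can be modelled on a small finite universe. The sketch below is illustrative only: the universe `UNIVERSE` and the function names `cap`, `cup`, `bar`, and `minus` are assumptions, and each function simply transcribes the defining condition of the corresponding class.

```python
UNIVERSE = frozenset(range(8))   # hypothetical finite universe of discourse

V = UNIVERSE                     # universal class: all x with x = x
NULL = frozenset()               # null class: all x with x != x


def cap(a, b):
    """A cap B: all x with x in A and x in B."""
    return frozenset(x for x in UNIVERSE if x in a and x in b)


def cup(a, b):
    """A cup B: all x with x in A or x in B."""
    return frozenset(x for x in UNIVERSE if x in a or x in b)


def bar(a):
    """A bar: all x with x not in A."""
    return frozenset(x for x in UNIVERSE if x not in a)


def minus(a, b):
    """A minus B, defined as A cap (B bar)."""
    return cap(a, bar(b))


A = frozenset({0, 1, 2, 3})
B = frozenset({2, 3, 4})
print(sorted(cap(A, B)), sorted(cup(A, B)), sorted(bar(A)), sorted(minus(A, B)))
```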

$\mathrm{V}$ and $\Lambda$ contain no free variables. The free variables of $A \cap B$ and
$A \cup B$ are just those of A and B separately. Certain bound variables are
inherent in the $\hat{x}$ used to define $A \cap B$ and $A \cup B$, but these variables are
distinct from any variables of A or B. Any other bound variables are those
of A and B. Similar remarks hold for $\bar{A}$.

One may think of $\mathrm{V}$ as the initial letter of "Universal class" in the classic
roman alphabet. Whitehead and Russell point to the appropriateness of
reversing the symbol for the universal class to get the symbol, $\Lambda$, for the
null class. Also they call attention to the fact that, when the symbol is
denoting the universal class, it is turned so as to hold the maximum amount,
whereas when it is denoting the null class, it is turned so as to hold nothing.
Also they note that the cup used in forming the sum of two classes is
turned so as to hold the maximum amount, which agrees with the definition
of the sum class.

$\mathrm{V}$ and $\Lambda$ are stratified and, since they contain no free variables at all, can
be assigned any types whatever. Two $\mathrm{V}$'s in the same formula can be
given different types if desired. Similarly for two $\Lambda$'s, or a $\Lambda$ and a $\mathrm{V}$. To
stratify a statement containing $A \cap B$, it will turn out that A and B have
to have the same type, which must be taken as the type of $A \cap B$.
Moreover, to stratify $A \cap B$ it is necessary and sufficient that we can stratify
each of A and B separately in such a way that the subscripts attached to
free occurrences of any variable in A are the same as the subscripts
attached to free occurrences of this same variable in B. That is, $A \cap B$ is
stratified if and only if $A = B$ is stratified. Similar remarks hold for $A \cup B$.
Finally, $\bar{A}$ is stratified if and only if A is, and if $\bar{A}$ is stratified, it has the
same type as A.

In trying to picture the relations of the class calculus, it is helpful to use
Venn diagrams. Figure IX.4.1 is such a diagram. The points in the interior
of the large square are supposed to represent the universe of discourse;
the members of $\mathrm{V}$, in other words. The points in the region with horizontal
shading represent members of A, and the points in the region with vertical
shading represent members of B. The points in the region with both kinds
of shading represent members of $A \cap B$. The points in the entire region
which has the shading of some kind represent members of $A \cup B$. The
points in the region with no horizontal shading represent members of $\bar{A}$,
and the points in the region with no vertical shading represent $\bar{B}$.

Fig. IX.4.1.

By  drawing  a  Venn  diagram  with  three  classes,  A,  B,  and  C,  one  can 
verify  the  commutative,  associative,  distributive,  and  other  principles 
which  we  shall  shortly  prove.  Also  the  Venn  diagram  is  a  useful  device 
in  recalling  these  various  principles. 

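Theorem IX.4.1.

I. $\vdash \exists\hat{x}(x = x).$
II. $\vdash \exists\hat{x}(x \neq x).$
III. $\vdash (\alpha,\beta).\exists\hat{x}(x \in \alpha \,\&\, x \in \beta).$
IV. $\vdash (\alpha,\beta).\exists\hat{x}(x \in \alpha \vee x \in \beta).$
V. $\vdash (\alpha).\exists\hat{x}(\sim x \in \alpha).$

Proof. Use Thm.IX.3.1.

Note that our conventions ensure the distinctness of $\alpha$, $\beta$, and x in Parts
III and IV, and of $\alpha$ and x in Part V.

Theorem IX.4.2.

*I. $\vdash (x):x \in \mathrm{V}. \equiv .x = x.$
*II. $\vdash (x):x \in \Lambda. \equiv .x \neq x.$
**III. $\vdash (\alpha,\beta,x):x \in \alpha \cap \beta. \equiv .x \in \alpha \,\&\, x \in \beta.$
**IV. $\vdash (\alpha,\beta,x):x \in \alpha \cup \beta. \equiv .x \in \alpha \vee x \in \beta.$
**V. $\vdash (\alpha,x):x \in \bar{\alpha}. \equiv .\sim x \in \alpha.$

Proof. Use Thm.IX.4.1 and Thm.IX.3.2, Cor. 1.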

Using Thm.IX.4.1, Part III, one can prove the existence of classes
determined by unstratified conditions. For instance, use Axiom scheme 8
to replace $\alpha$ by $\hat{y}(\delta \in y)$, and then use Axiom scheme 6 to replace $\beta$ by $\delta$.
There results

(1) $\vdash \exists\hat{x}(x \in \hat{y}(\delta \in y) \,\&\, x \in \delta).$

Now

(2) $x \in \hat{y}(\delta \in y) \,\&\, x \in \delta$

is unstratified; for suppose we attach the subscript unity to the two
occurrences of $\delta$. Then in $x \in \delta$ we must attach the subscript zero to x. In $\delta \in y$
we must attach the subscript 2 to y. Then we must consider $\hat{y}(\delta \in y)$ of
type 3 (see the definition of $\hat{x}P$). Hence in the part $x \in \hat{y}(\delta \in y)$, we have a
type difference of 3 instead of unity.

Nevertheless, though (2) is not stratified, (1) asserts that (2) determines
a class.
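The subscript bookkeeping in this argument can be mechanized. The following is a minimal sketch under stated assumptions: a formula is summarized by hand as a list of constraints `(lower, upper, diff)` meaning "the type of `upper` exceeds the type of `lower` by `diff`", and the function name `stratifiable` is hypothetical. It is not a full stratification test for arbitrary formulas.

```python
from collections import defaultdict, deque


def stratifiable(constraints):
    """Try to assign integer types satisfying type(upper) - type(lower) == diff.

    Returns a dict of types if the constraints are consistent, else None.
    """
    graph = defaultdict(list)
    for lower, upper, diff in constraints:
        graph[lower].append((upper, diff))
        graph[upper].append((lower, -diff))
    types = {}
    for start in list(graph):
        if start in types:
            continue
        types[start] = 0
        queue = deque([start])
        while queue:
            node = queue.popleft()
            for other, diff in graph[node]:
                expected = types[node] + diff
                if other not in types:
                    types[other] = expected
                    queue.append(other)
                elif types[other] != expected:
                    return None  # two different subscripts forced on one term
    return types


# Statement (2): x in yhat(delta in y) & x in delta.
statement_2 = [
    ("x", "delta", 1),   # x in delta
    ("delta", "y", 1),   # delta in y
    ("y", "yhat", 1),    # yhat(delta in y) is one type above y
    ("x", "yhat", 1),    # x in yhat(delta in y)
]
print(stratifiable(statement_2))

# The condition of Thm.IX.4.1, Part III: x in alpha & x in beta.
print(stratifiable([("x", "alpha", 1), ("x", "beta", 1)]))
```

The first call reports an inconsistency, matching the type difference of 3 found above, while the second finds the assignment in which $\alpha$ and $\beta$ share a type one above x.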

It  might  be  feared  that  this  possibility  of  determining  classes  by  certain 
types  of  unstratified  conditions  would  lead  to  a  contradiction  of  some  sort, 
but  apparently  it  does  not. 

One could avoid this means of determining classes by means of unstratified
conditions by putting additional restrictions on Axiom scheme 6 and
Axiom scheme 8. For instance, one could require that $\{\text{Sub in } P: y \text{ for } x\}$
in Axiom scheme 6 and $\{\text{Sub in } P: \iota y\,Q \text{ for } x\}$ in Axiom scheme 8 should
be stratified. A less stringent requirement would be to require in Axiom
scheme 6 that $\{\text{Sub in } P: y \text{ for } x\}$ be stratified if P is, and in Axiom scheme 8
that $\{\text{Sub in } P: \iota y\,Q \text{ for } x\}$ be stratified if P is.

As  long  as  we  get  into  no  difficulties  by  allowing  the  determination  of 
classes  by  special  unstratified  conditions,  we  welcome  the  flexibility  which 
this  allows  us.  We  hold  in  reserve  the  additional  requirements  on  Axiom 
schemes  6  and  8  in  case  they  are  ever  needed  to  prevent  a  contradiction. 
**Theorem IX.4.3. Let $x, \alpha_1, \alpha_2, \ldots, \alpha_n$ be distinct variables. Let A be
built up out of the $\alpha$'s by use of $\cap$, $\cup$, and $\bar{\ }$. Let P be the result of
replacing $\alpha_i$ by $x \in \alpha_i$, $\cap$ by $\&$, $\cup$ by $\vee$, and $\bar{\ }$ by $\sim$ in A. Then
$\vdash (\alpha_1, \alpha_2, \ldots, \alpha_n)(x).x \in A \equiv P.$

Proof. We note that Thm.IX.4.2, Parts III, IV, and V are special cases
of this theorem, and furnish just the lemmas needed to prove the theorem by
induction on the number of symbols in A.

When A and P are related as in this theorem, we say that P is the
statement analogous to the class A and A is the class analogous to the
statement P.

Thm.IX.4.3 gives us a very potent means of proving equalities between
classes. Let A and B be built up out of some or all of the distinct variables
$\alpha_1, \alpha_2, \ldots, \alpha_n$ by use of $\cap$, $\cup$, and $\bar{\ }$. Let P and Q be the analogous
statements. P and Q will then be statements of the statement calculus, whose
constituent parts are of the form $x \in \alpha_i$. If one can prove $\vdash P \equiv Q$, then by
Thm.IX.4.3 we shall have $\vdash x \in A \equiv x \in B$, whence we get $\vdash A = B$. In
case $\vdash P \equiv Q$ is proved by truth values, we shall say that $\vdash A = B$ is proved
by truth values.
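Proof by truth values is easy to mechanize. In the sketch below (a hedged illustration, with the function name `equal_by_truth_values` an assumption), each analogous statement is written as a Python function of the truth values of its constituents $x \in \alpha_i$, and all assignments are tried.

```python
from itertools import product


def equal_by_truth_values(P, Q, n):
    """True when statements P and Q agree under every assignment of
    truth values to their n constituent parts."""
    return all(P(*vals) == Q(*vals)
               for vals in product([False, True], repeat=n))


# a stands for "x in alpha", b for "x in beta".
# complement of (alpha cup beta)  versus  (alpha bar) cap (beta bar):
print(equal_by_truth_values(lambda a, b: not (a or b),
                            lambda a, b: (not a) and (not b), 2))

# alpha cup beta  versus  alpha cap beta (these classes are not equal in general):
print(equal_by_truth_values(lambda a, b: a or b,
                            lambda a, b: a and b, 2))
```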

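Theorem IX.4.4.

**I. $\vdash (\alpha)\ \alpha = \bar{\bar{\alpha}}.$
**II. $\vdash (\alpha)\ \alpha = \alpha \cap \alpha.$
**III. $\vdash (\alpha)\ \alpha = \alpha \cup \alpha.$
IV. $\vdash (\alpha,\beta)\ \alpha = \alpha \cap (\alpha \cup \beta).$
V. $\vdash (\alpha,\beta)\ \alpha = \alpha \cup (\alpha \cap \beta).$
VI. $\vdash (\alpha,\beta)\ \alpha = (\alpha \cap \beta) \cup (\alpha \cap \bar{\beta}).$
VII. $\vdash (\alpha,\beta)\ \alpha = (\alpha \cup \beta) \cap (\alpha \cup \bar{\beta}).$
**VIII. $\vdash (\alpha,\beta)\ \alpha \cap \beta = \beta \cap \alpha.$
**IX. $\vdash (\alpha,\beta)\ \alpha \cup \beta = \beta \cup \alpha.$
**X. $\vdash (\alpha,\beta,\gamma)\ \alpha \cap (\beta \cap \gamma) = (\alpha \cap \beta) \cap \gamma.$
**XI. $\vdash (\alpha,\beta,\gamma)\ \alpha \cup (\beta \cup \gamma) = (\alpha \cup \beta) \cup \gamma.$
**XII. $\vdash (\alpha,\beta,\gamma)\ \alpha \cap (\beta \cup \gamma) = (\alpha \cap \beta) \cup (\alpha \cap \gamma).$
**XIII. $\vdash (\alpha,\beta,\gamma)\ \alpha \cup (\beta \cap \gamma) = (\alpha \cup \beta) \cap (\alpha \cup \gamma).$
XIV. $\vdash (\alpha,\beta)\ \overline{\alpha \cup \beta} = \bar{\alpha} \cap \bar{\beta}.$
XV. $\vdash (\alpha,\beta)\ \overline{\alpha \cap \beta} = \bar{\alpha} \cup \bar{\beta}.$
XVI. $\vdash (\alpha,\beta)\ \alpha \cup \beta = \overline{\bar{\alpha} \cap \bar{\beta}}.$
XVII. $\vdash (\alpha,\beta)\ \alpha \cap \beta = \overline{\bar{\alpha} \cup \bar{\beta}}.$
XVIII. $\vdash (\alpha,\beta)\ \alpha \cap \beta = (\alpha \cup \bar{\beta}) \cap \beta.$
*XIX. $\vdash (\alpha,\beta)\ \alpha \cup \beta = (\alpha \cap \bar{\beta}) \cup \beta.$
XX. $\vdash (\alpha,\beta)\ \alpha \cup \beta = (\alpha \cap \beta) \cup (\alpha \cap \bar{\beta}) \cup (\bar{\alpha} \cap \beta).$
XXI. $\vdash (\alpha,\beta)\ \alpha \cap \beta = (\alpha \cup \beta) \cap (\alpha \cup \bar{\beta}) \cap (\bar{\alpha} \cup \beta).$
XXII. $\vdash (\alpha,\beta)\ \overline{\alpha \cap \beta} = (\bar{\alpha} \cap \bar{\beta}) \cup (\bar{\alpha} \cap \beta) \cup (\alpha \cap \bar{\beta}).$
XXIII. $\vdash (\alpha,\beta)\ \overline{\alpha \cup \beta} = (\bar{\alpha} \cup \bar{\beta}) \cap (\alpha \cup \bar{\beta}) \cap (\bar{\alpha} \cup \beta).$
**XXIV. $\vdash (\alpha)\ \alpha \cap \bar{\alpha} = \Lambda.$
**XXV. $\vdash (\alpha)\ \alpha \cup \bar{\alpha} = \mathrm{V}.$
**XXVI. $\vdash \bar{\Lambda} = \mathrm{V}.$
**XXVII. $\vdash \bar{\mathrm{V}} = \Lambda.$
**XXVIII. $\vdash (\alpha)\ \alpha \cap \mathrm{V} = \alpha.$
**XXIX. $\vdash (\alpha)\ \alpha \cup \mathrm{V} = \mathrm{V}.$
**XXX. $\vdash (\alpha)\ \alpha \cap \Lambda = \Lambda.$
**XXXI. $\vdash (\alpha)\ \alpha \cup \Lambda = \alpha.$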

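Proof. All except the last eight parts are proved by truth values. To
illustrate a proof by truth values, we give the proof of Part XX.

The statements analogous to $\alpha \cup \beta$ and $(\alpha \cap \beta) \cup (\alpha \cap \bar{\beta}) \cup (\bar{\alpha} \cap \beta)$ are

$$x \in \alpha \vee x \in \beta$$

and

$$x \in \alpha.x \in \beta:\vee:x \in \alpha.\sim x \in \beta:\vee:\sim x \in \alpha.x \in \beta.$$

By truth values

$$\vdash x \in \alpha \vee x \in \beta.: \equiv :.x \in \alpha.x \in \beta:\vee:x \in \alpha.\sim x \in \beta:\vee:\sim x \in \alpha.x \in \beta$$

(see Thm.VI.6.1, Part XXXVII).
So by Thm.IX.4.3,

$$\vdash x \in \alpha \cup \beta. \equiv .x \in (\alpha \cap \beta) \cup (\alpha \cap \bar{\beta}) \cup (\bar{\alpha} \cap \beta).$$

So

$$\vdash \alpha \cup \beta = (\alpha \cap \beta) \cup (\alpha \cap \bar{\beta}) \cup (\bar{\alpha} \cap \beta).$$

By Parts XXIV and XXV we can write the last six parts in the forms

XXVI. $\vdash (\alpha)\ \overline{\alpha \cap \bar{\alpha}} = \alpha \cup \bar{\alpha}.$
XXVII. $\vdash (\alpha)\ \overline{\alpha \cup \bar{\alpha}} = \alpha \cap \bar{\alpha}.$
XXVIII. $\vdash (\alpha,\beta)\ \alpha \cap (\beta \cup \bar{\beta}) = \alpha.$
XXIX. $\vdash (\alpha,\beta)\ \alpha \cup (\beta \cup \bar{\beta}) = \beta \cup \bar{\beta}.$
XXX. $\vdash (\alpha,\beta)\ \alpha \cap (\beta \cap \bar{\beta}) = \beta \cap \bar{\beta}.$
XXXI. $\vdash (\alpha,\beta)\ \alpha \cup (\beta \cap \bar{\beta}) = \alpha.$

In these forms, they can be proved by truth values. This leaves us with
only Parts XXIV and XXV to prove. For these we need merely

$$\vdash x \in \alpha.\vee.\sim x \in \alpha: \equiv :x = x$$

and

$$\vdash x \in \alpha.\sim x \in \alpha: \equiv :x \neq x,$$

which follow easily by Thm.VI.6.1, Parts LXXI and LXXIV.

Because we now have the associative law for $\cap$ and $\cup$, we can and will
hereafter write $\alpha \cap \beta \cap \gamma$ for either of $(\alpha \cap \beta) \cap \gamma$ and $\alpha \cap (\beta \cap \gamma)$; similarly
for $\alpha \cup \beta \cup \gamma$, $\alpha \cap \beta \cap \gamma \cap \delta$, $\alpha \cup \beta \cup \gamma \cup \delta$, etc.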

Let A be built up out of $\mathrm{V}$, $\Lambda$, and the distinct variables $\alpha_1, \alpha_2, \ldots, \alpha_n$ by
use of $\cap$, $\cup$, and $\bar{\ }$. Then the dual of A, $A^*$, is got from A by
simultaneously performing the following changes:

1. If an occurrence of $\alpha_i$ has no bar over it, place one over it (i.e., replace
$\alpha_i$ by $\bar{\alpha}_i$).

2. If an occurrence of $\alpha_i$ has a bar over it, replace the entire occurrence
of $\bar{\alpha}_i$ by $\alpha_i$.

3. Replace each $\cap$ by $\cup$ and each $\cup$ by $\cap$.

4. Replace each $\mathrm{V}$ by $\Lambda$ and each $\Lambda$ by $\mathrm{V}$.

**Theorem IX.4.5. If A is built up out of $\mathrm{V}$, $\Lambda$, and the distinct variables
$\alpha_1, \alpha_2, \ldots, \alpha_n$ by use of $\cap$, $\cup$, and $\bar{\ }$, and $A^*$ is the dual of A, then
$\vdash (\alpha_1, \alpha_2, \ldots, \alpha_n)\ A^* = \bar{A}.$

Proof. Choose $\beta$ a variable distinct from any of the $\alpha$'s and replace all
$\mathrm{V}$'s and $\Lambda$'s of A and $A^*$ by $\beta \cup \bar{\beta}$ and $\beta \cap \bar{\beta}$, and call the results B and $B^*$.
Clearly $B^*$ is the dual of B. Also, by Thm.IX.4.4, Parts XXIV and XXV,
$\vdash A = B$ and $\vdash A^* = B^*$. As B and $B^*$ are built up entirely from variables,
we can form the analogous statements, P and $P^*$. Clearly $P^*$ is the dual
of P. So

$$\vdash P^* \equiv\ \sim P.$$

Hence by Thm.IX.4.3,

$$\vdash x \in B^* \equiv\ \sim(x \in B).$$

So

$$\vdash x \in A^* \equiv\ \sim(x \in A).$$

Accordingly, by Thm.IX.4.2, Part V,

$$\vdash x \in A^* \equiv x \in \bar{A}.$$

So

$$\vdash A^* = \bar{A}.$$

Several parts of Thm.IX.4.4 are special cases of this duality theorem.
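The four rules for forming the dual, and the duality theorem itself, can be checked mechanically on particular expressions. The sketch below is illustrative only: class expressions are encoded as nested tuples (an assumed representation), `dual` applies rules 1 to 4, and `dual_is_complement` compares the analogous statements by truth values.

```python
from itertools import product


def dual(e):
    """Apply rules 1-4 to a class expression given as a nested tuple."""
    kind = e[0]
    if kind == "var":
        return ("bar", e)                      # rule 1: add a bar
    if kind == "bar":
        inner = e[1]
        if inner[0] == "var":
            return inner                       # rule 2: remove the bar
        return ("bar", dual(inner))            # bar over a compound stays
    if kind == "cap":
        return ("cup", dual(e[1]), dual(e[2]))  # rule 3
    if kind == "cup":
        return ("cap", dual(e[1]), dual(e[2]))  # rule 3
    if kind == "V":
        return ("L",)                          # rule 4
    if kind == "L":
        return ("V",)                          # rule 4
    raise ValueError(kind)


def member(e, env):
    """Truth value of 'x in e', given truth values of 'x in variable'."""
    kind = e[0]
    if kind == "var":
        return env[e[1]]
    if kind == "bar":
        return not member(e[1], env)
    if kind == "cap":
        return member(e[1], env) and member(e[2], env)
    if kind == "cup":
        return member(e[1], env) or member(e[2], env)
    if kind == "V":
        return True
    if kind == "L":
        return False
    raise ValueError(kind)


def dual_is_complement(e, names):
    """Check by truth values that the dual of e equals the complement of e."""
    for vals in product([False, True], repeat=len(names)):
        env = dict(zip(names, vals))
        if member(dual(e), env) != (not member(e, env)):
            return False
    return True


a, b = ("var", "a"), ("var", "b")
example = ("cup", ("cap", a, ("bar", b)), ("V",))
print(dual(example))
print(dual_is_complement(example, ["a", "b"]))
```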

From this duality theorem, one easily has another type of duality
theorem. Let A and B be built up out of $\mathrm{V}$, $\Lambda$, and the distinct variables
$\alpha_1, \alpha_2, \ldots, \alpha_n$ by use of $\cap$, $\cup$, and $\bar{\ }$, and suppose we have $\vdash A = B$. Then
we have $\vdash C = D$, where C and D are got from A and B by replacing $\cap$ by $\cup$,
$\cup$ by $\cap$, $\mathrm{V}$ by $\Lambda$, and $\Lambda$ by $\mathrm{V}$. For from $\vdash A = B$, we have $\vdash \bar{A} = \bar{B}$. Then
by the duality theorem $\vdash A^* = B^*$. So by rule G,
$\vdash (\alpha_1, \alpha_2, \ldots, \alpha_n)\ A^* = B^*$. Now by Axiom scheme 8 replace each $\alpha_i$ by $\bar{\alpha}_i$.
With a few applications of $\vdash \bar{\bar{\alpha}}_i = \alpha_i$, we get $\vdash C = D$.

Many pairs of parts of Thm.IX.4.4 will be seen to be related by this type
of duality.

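Theorem IX.4.6. $\vdash (\alpha)\ \alpha \neq \bar{\alpha}.$

Proof. By the definition of equality

$$\vdash \alpha = \bar{\alpha}. \supset .x \in \alpha \equiv x \in \bar{\alpha}.$$

So

$$\vdash \alpha = \bar{\alpha}: \supset :x \in \alpha. \equiv .\sim x \in \alpha.$$

However, by truth values

$$\vdash P \supset .Q \equiv\ \sim Q: \supset :\sim P.$$

So $\vdash \sim(\alpha = \bar{\alpha}).$
*Corollary. $\vdash \Lambda \neq \mathrm{V}.$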

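Theorem IX.4.7. $\vdash (\alpha,\beta):\bar{\alpha} = \bar{\beta}. \supset .\alpha = \beta.$

Proof. By Thm.IX.2.5, $\vdash \bar{\alpha} = \bar{\beta}. \supset .\bar{\bar{\alpha}} = \bar{\bar{\beta}}.$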

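Theorem IX.4.8. If P contains free occurrences of $\alpha$, then
$\vdash \exists\{\bar{\alpha} \mid P\}.: \supset :.(\alpha):\bar{\alpha} \in \{\bar{\alpha} \mid P\}. \equiv .P.$

Proof. Use Thm.IX.3.7 and Thm.IX.4.7.

Corollary. If P is stratified and contains free occurrences of $\alpha$, then
$\vdash (\alpha):\bar{\alpha} \in \{\bar{\alpha} \mid P\}. \equiv .P.$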

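Theorem IX.4.9. $\vdash (\alpha,\beta,\gamma):\alpha \cup \beta = \gamma.\alpha \cap \beta = \Lambda. \supset .\alpha = \gamma - \beta.$

Proof. Assume $\alpha \cup \beta = \gamma$ and $\alpha \cap \beta = \Lambda$. By Thm.IX.4.4, Part XIX,
$\vdash \alpha \cup \bar{\beta} = (\alpha \cap \beta) \cup \bar{\beta}$. So by our assumption, $\alpha \cup \bar{\beta} = \Lambda \cup \bar{\beta}$, and so
by Thm.IX.4.4, Part XXXI, $\alpha \cup \bar{\beta} = \bar{\beta}$. So by Thm.IX.4.4, Part VII,
$\alpha = (\alpha \cup \beta) \cap (\alpha \cup \bar{\beta}) = \gamma \cap \bar{\beta} = \gamma - \beta.$

Corollary 1. $\vdash (\alpha,\beta):\alpha \cup \beta = \mathrm{V}.\alpha \cap \beta = \Lambda. \supset .\alpha = \bar{\beta}.$

Proof. Put $\gamma = \mathrm{V}$.

Corollary 2. $\vdash (\alpha,\beta):\alpha \cap \beta = \Lambda. \supset .\alpha = (\alpha \cup \beta) - \beta.$

Proof. Put $\gamma = \alpha \cup \beta$.

Corollary 3. $\vdash (\alpha,\beta,\gamma):\alpha \cup \beta = \gamma \cup \beta.\alpha \cap \beta = \Lambda.\gamma \cap \beta = \Lambda. \supset .\alpha = \gamma.$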



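Proof. By Cor. 2, $\alpha = (\alpha \cup \beta) - \beta = (\gamma \cup \beta) - \beta = \gamma.$

One can also prove Thm.IX.4.9 by use of truth values as follows.

By truth values,

$$\vdash P \vee Q \equiv R.PQ \equiv S{\sim}S. \supset .P \equiv R{\sim}Q.$$

Replacing P by $x \in \alpha$, Q by $x \in \beta$, R by $x \in \gamma$, and S by $x \in \delta$, we get

$$\vdash x \in \alpha \cup \beta \equiv x \in \gamma.x \in \alpha \cap \beta \equiv x \in \delta \cap \bar{\delta}. \supset .x \in \alpha \equiv x \in \gamma - \beta.$$

Then by rule G, Axiom scheme 4, and Thm.VI.6.5, Part I

$$\vdash (x).x \in \alpha \cup \beta \equiv x \in \gamma.(x).x \in \alpha \cap \beta \equiv x \in \Lambda: \supset :(x).x \in \alpha \equiv x \in \gamma - \beta.$$

Our theorem now follows by the definition of equality.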
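Both routes to Thm.IX.4.9 can be checked by brute force. The sketch below is a hedged illustration: the truth-value formula is tested over all sixteen assignments, and the class statement is tested over all subsets of a small, assumed universe; the helper names are hypothetical.

```python
from itertools import combinations, product


def implies(p, q):
    """Material implication."""
    return (not p) or q


# (P v Q = R).(PQ = S~S)  implies  (P = R~Q), for every assignment.
tautology = all(
    implies(((P or Q) == R) and ((P and Q) == (S and not S)),
            P == (R and not Q))
    for P, Q, R, S in product([False, True], repeat=4)
)
print(tautology)

UNIVERSE = frozenset(range(3))   # hypothetical small universe


def subsets(u):
    """All subclasses of the finite universe u."""
    items = sorted(u)
    return [frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]


# alpha cup beta = gamma and alpha cap beta = null class  imply  alpha = gamma - beta.
theorem_holds = all(
    implies(a | b == c and a & b == frozenset(), a == c - b)
    for a in subsets(UNIVERSE)
    for b in subsets(UNIVERSE)
    for c in subsets(UNIVERSE)
)
print(theorem_holds)
```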
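Theorem IX.4.10.

I. $\vdash (x)\ x \in \mathrm{V}.$
*II. $\vdash (x) \sim x \in \Lambda.$

Proof. Use Thm.IX.4.2, Parts I and II.
Theorem IX.4.11.
I. $\vdash (x)\,Q: \supset :(x).x \in \mathrm{V} \equiv Q.$
II. $\vdash (x) \sim Q: \supset :(x).x \in \Lambda \equiv Q.$

Proof of I. By putting $x \in \mathrm{V}$ for P in Thm.VI.6.1, Part LXXI, we get
$\vdash Q: \supset :x \in \mathrm{V} \equiv Q$. So $\vdash (x)\,Q: \supset :(x).x \in \mathrm{V} \equiv Q.$

Proof of Part II is similar, using Thm.VI.6.1, Part LXXIV.
Corollary 1.
I. $\vdash (x)\,Q. \supset .\exists\hat{x}Q.$
II. $\vdash (x) \sim Q. \supset .\exists\hat{x}Q.$

Corollary 2.
I. $\vdash (x)\,Q. \supset .\mathrm{V} = \hat{x}Q.$
II. $\vdash (x) \sim Q. \supset .\Lambda = \hat{x}Q.$

Theorem IX.4.12.
I. $\vdash (\alpha):.(x).x \in \alpha: \equiv :\alpha = \mathrm{V}.$
II. $\vdash (\alpha):.(x).\sim x \in \alpha: \equiv :\alpha = \Lambda.$

Proof of I. By Thm.IX.4.10, Part I, and Thm.IX.2.4,

$$\vdash \alpha = \mathrm{V}: \supset :(x).x \in \alpha.$$

Conversely, putting $x \in \alpha$ for Q in Thm.IX.4.11, Part I, gives

$$\vdash (x).x \in \alpha: \supset :\alpha = \mathrm{V}.$$

Proof of Part II is similar.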
Corollary.

I. $\vdash (\alpha):.(Ex).\sim x \in \alpha: \equiv :\alpha \neq \mathrm{V}.$
*II. $\vdash (\alpha):.(Ex).x \in \alpha: \equiv :\alpha \neq \Lambda.$

We introduce the definitions:

    $A \subseteq B$                 for    $(x).x \in A \supset x \in B$,
    $A \subset B$                   for    $A \subseteq B \,\&\, A \neq B$,
    $A \subseteq B \subseteq C$     for    $A \subseteq B \,\&\, B \subseteq C$,
    $A \subset B \subseteq C$       for    $A \subset B \,\&\, B \subseteq C$,
    $A \subseteq B = C$             for    $A \subseteq B \,\&\, B = C$,
    etc.

In the first of these we require that x be a variable which does not occur
in A or B. Clearly, except for the bound x, all free occurrences of variables
of $A \subseteq B$ are the free occurrences in A and B, and similarly for the bound
occurrences.

We read $A \subseteq B$ as "A is included in B" or "B includes A" or "A is a
subset of B." We read $A \subset B$ as "A is a proper subset of B."

Many logicians use the symbol $\subset$ to mean what we denote by $\subseteq$. This
leaves them no simple notation for what we denote by $\subset$. Our notation
has general mathematical sanction.

In the Venn diagram, $A \subseteq B$ would mean that all the area denoted by A
is comprised within the area denoted by B, and $A \subset B$ would mean that
moreover some of B is not a part of A.

$A \subseteq B$ and $A \subset B$ are stratified if and only if $A = B$ is stratified. In
particular, for stratification of $A \subseteq B$ or $A \subset B$, A and B must have the
same type.

We  now  prove  a  theorem  which  is  very  important  because  it  allows  us  to 
replace  the  inclusion  of  two  classes  by  the  equality  of  two  other  classes, 
and  in  fact  in  four  different  ways.  Thus  all  our  means  of  proving  equalities 
between  classes  become  available  for  proving  inclusions  between  classes. 

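Theorem IX.4.13.

**I. $\vdash (\alpha,\beta): \alpha \subseteq \beta. \equiv . \alpha \cap \beta = \alpha$.

II. $\vdash (\alpha,\beta): \alpha \subseteq \beta. \equiv . \alpha \cap \bar{\beta} = \Lambda$.

*III. $\vdash (\alpha,\beta): \alpha \subseteq \beta. \equiv . \alpha \cup \beta = \beta$.

IV. $\vdash (\alpha,\beta): \alpha \subseteq \beta. \equiv . \bar{\alpha} \cup \beta = V$.

Proof of Part I. By Thm.VI.6.1, Part XLIX, and Thm.IX.4.2, Part III,
$\vdash (x). x \epsilon \alpha \supset x \epsilon \beta: \equiv :(x). x \epsilon \alpha \equiv x \epsilon \alpha \cap \beta$. By the definitions of equality
and of $\subseteq$, this is Part I.

Proof of Part III is similar, using Thm.VI.6.1, Part L, and Thm.IX.4.2,
Part IV.

Proof of Part II. By Parts I and III, it suffices to prove

(1) $\vdash \alpha \cap \beta = \alpha. \supset . \alpha \cap \bar{\beta} = \Lambda$,

(2) $\vdash \alpha \cap \bar{\beta} = \Lambda. \supset . \alpha \cup \beta = \beta$.

To prove (1), assume $\alpha \cap \beta = \alpha$. Then $\alpha \cap \bar{\beta} = (\alpha \cap \beta) \cap \bar{\beta} = \alpha \cap (\beta \cap \bar{\beta}) = \alpha \cap \Lambda = \Lambda$.
To prove (2), assume $\alpha \cap \bar{\beta} = \Lambda$. Then by
Thm.IX.4.4, Part XIX, $\alpha \cup \beta = (\alpha \cap \bar{\beta}) \cup \beta = \Lambda \cup \beta = \beta$.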

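Proof of Part IV. By Part II, it suffices to prove

(3) $\vdash \alpha \cap \bar{\beta} = \Lambda. \equiv . \bar{\alpha} \cup \beta = V$.

Assume $\alpha \cap \bar{\beta} = \Lambda$. Then $\overline{\alpha \cap \bar{\beta}} = \bar{\Lambda}$. That is, $\bar{\alpha} \cup \beta = V$. Conversely,
if $\bar{\alpha} \cup \beta = V$, then by reversing our steps, we get $\alpha \cap \bar{\beta} = \Lambda$.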
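
As a finite sanity check of Thm.IX.4.13 (an illustration, not a proof), the following sketch tests all four equalities against inclusion for every pair of subclasses of a small universe. Python's `frozenset` stands in for classes; `UNIVERSE` is an assumed four-element stand-in for V, and the helper names `subclasses`, `complement`, `four_forms`, and `check_thm_4_13` are hypothetical, chosen only for this example.

```python
from itertools import combinations

# Assumption: a four-element class plays the role of the universal class V.
UNIVERSE = frozenset(range(4))

def subclasses(u):
    """Return every subclass of the finite class u as a frozenset."""
    items = list(u)
    return [frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]

def complement(a):
    """Complement of a relative to the assumed universe."""
    return UNIVERSE - a

def four_forms(a, b):
    """Truth values of the four equalities of Thm.IX.4.13, Parts I to IV."""
    return (a & b == a,
            a & complement(b) == frozenset(),
            a | b == b,
            complement(a) | b == UNIVERSE)

def check_thm_4_13():
    """True if each equality holds exactly when a is included in b."""
    classes = subclasses(UNIVERSE)
    return all(all(form == (a <= b) for form in four_forms(a, b))
               for a in classes for b in classes)

print(check_thm_4_13())
```

The check runs over all 256 pairs of subclasses, so it only confirms the theorem for this one finite universe; the formal proof above is what establishes it in general.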

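Corollary 1. $\vdash (\beta): V \subseteq \beta. \equiv . \beta = V$.

Proof. Put $\alpha = V$ in Part I.

Corollary 2. $\vdash (\alpha): \alpha \subseteq \Lambda. \equiv . \Lambda = \alpha$.

Corollary 3. $\vdash (\alpha). \alpha \subseteq \alpha$.

Corollary 4. $\vdash (\alpha). \alpha \subseteq V$.

Corollary 5. $\vdash (\beta). \Lambda \subseteq \beta$.
*Corollary 6. $\vdash (\alpha,\beta). \alpha \cap \beta \subseteq \beta$.
*Corollary 7. $\vdash (\alpha,\beta). \alpha \subseteq \alpha \cup \beta$.
**Theorem IX.4.14. $\vdash (\alpha,\beta): \alpha \subseteq \beta. \beta \subseteq \alpha. \equiv . \alpha = \beta$.

Proof. By Thm.VI.6.5, Part I, $\vdash (x). x \epsilon \alpha \supset x \epsilon \beta:(x). x \epsilon \beta \supset x \epsilon \alpha:. \equiv :.$
$(x): x \epsilon \alpha \supset x \epsilon \beta. x \epsilon \beta \supset x \epsilon \alpha$. By the definitions of equality and inclusion,
this is our theorem.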

This  theorem  is  very  commonly  used  to  prove  the  equality  of  two  classes. 
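**Theorem IX.4.15. $\vdash (\alpha,\beta,\gamma): \alpha \subseteq \beta. \beta \subseteq \gamma. \supset . \alpha \subseteq \gamma$.

Proof. Assume $\alpha \subseteq \beta$ and $\beta \subseteq \gamma$. Then by Thm.IX.4.13, Part I,
$\alpha \cap \beta = \alpha$ and $\beta \cap \gamma = \beta$. So $\alpha \cap \gamma = (\alpha \cap \beta) \cap \gamma = \alpha \cap (\beta \cap \gamma) = \alpha \cap \beta = \alpha$.

For an alternative proof, one can put $x \epsilon \alpha$, $x \epsilon \beta$, and $x \epsilon \gamma$ for $P$, $Q$, and
$R$ in

$\vdash P \supset Q. Q \supset R. \supset . P \supset R.$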

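Theorem IX.4.16.

I. $\vdash (\alpha,\beta,\gamma): \alpha \subseteq \beta. \supset . \alpha \cap \gamma \subseteq \beta \cap \gamma$.
II. $\vdash (\alpha,\beta,\gamma): \alpha \subseteq \beta. \supset . \alpha \cup \gamma \subseteq \beta \cup \gamma$.

Proof of Part I. Assume $\alpha \subseteq \beta$. Then by Thm.IX.4.13, Part I,
$\alpha \cap \beta = \alpha$. So $(\alpha \cap \beta) \cap (\gamma \cap \gamma) = \alpha \cap \gamma$. So $(\alpha \cap \gamma) \cap (\beta \cap \gamma) = \alpha \cap \gamma$.
Accordingly, $\alpha \cap \gamma \subseteq \beta \cap \gamma$.

Proof of Part II is similar using Thm.IX.4.13, Part III.

Alternatively one can derive Parts I and II from $\vdash P \supset Q. \supset . PR \supset QR$
and $\vdash P \supset Q. \supset . P \vee R \supset Q \vee R$, respectively.

Corollary 1. $\vdash (\alpha,\beta,\gamma,\delta): \alpha \subseteq \beta. \gamma \subseteq \delta. \supset . \alpha \cap \gamma \subseteq \beta \cap \delta$.

For $\alpha \subseteq \beta$ gives $\alpha \cap \gamma \subseteq \beta \cap \gamma$ and $\gamma \subseteq \delta$ gives $\beta \cap \gamma \subseteq \beta \cap \delta$.

Corollary 2. $\vdash (\alpha,\beta,\gamma,\delta): \alpha \subseteq \beta. \gamma \subseteq \delta. \supset . \alpha \cup \gamma \subseteq \beta \cup \delta$.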

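Theorem IX.4.17. $\vdash (\alpha,\beta): \alpha \subseteq \beta. \equiv . \bar{\beta} \subseteq \bar{\alpha}$.

Proof. Assume $\alpha \subseteq \beta$. Then $\alpha \cap \beta = \alpha$. So $\overline{\alpha \cap \beta} = \bar{\alpha}$. That is,
$\bar{\alpha} \cup \bar{\beta} = \bar{\alpha}$. So $\bar{\beta} \subseteq \bar{\alpha}$. Since these steps are reversible, we get our theorem.

Alternatively one can derive the theorem from $\vdash P \supset Q. \equiv . \sim Q \supset \sim P$.

Theorem IX.4.18.
*I. $\vdash (\alpha,\beta,\gamma):. \alpha \subseteq \beta \cap \gamma: \equiv : \alpha \subseteq \beta. \alpha \subseteq \gamma$.
*II. $\vdash (\alpha,\beta,\gamma):. \alpha \cup \beta \subseteq \gamma: \equiv : \alpha \subseteq \gamma. \beta \subseteq \gamma$.

Proof of Part I. Assume $\alpha \subseteq \beta \cap \gamma$. Combining this with $\beta \cap \gamma \subseteq \beta$
and $\beta \cap \gamma \subseteq \gamma$ (which we get by Thm.IX.4.13, Cor. 6), we get $\alpha \subseteq \beta$ and
$\alpha \subseteq \gamma$. Conversely, assume $\alpha \subseteq \beta$ and $\alpha \subseteq \gamma$. Then $\alpha \subseteq \beta \cap \gamma$ by
Thm.IX.4.16, Cor. 1.

Proof of Part II is similar.

Alternatively one can derive Parts I and II from $\vdash P \supset QR: \equiv : P \supset Q. P \supset R$
and $\vdash P \vee Q \supset R: \equiv : P \supset R. Q \supset R$, respectively.

Corollary 1. $\vdash (\beta,\gamma):. V = \beta \cap \gamma: \equiv : V = \beta. V = \gamma$.

Corollary 2. $\vdash (\alpha,\beta):. \alpha \cup \beta = \Lambda: \equiv : \alpha = \Lambda. \beta = \Lambda$.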

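Theorem IX.4.19. $\vdash (\alpha,\beta): \alpha \subset \beta. \supset . \sim(\beta \subseteq \alpha)$.

Proof. By Thm.IX.4.14, $\vdash \alpha \subseteq \beta: \supset : \beta \subseteq \alpha. \supset . \alpha = \beta$. So $\vdash \alpha \subseteq \beta: \supset :$
$\alpha \neq \beta. \supset . \sim(\beta \subseteq \alpha)$. Then $\vdash \alpha \subseteq \beta. \alpha \neq \beta. \supset . \sim(\beta \subseteq \alpha)$, which is our
theorem.

Theorem IX.4.20. $\vdash (\alpha,\beta):. \alpha \subset \beta: \equiv : \alpha \subseteq \beta. \sim(\beta \subseteq \alpha)$.

Proof. By Thm.IX.4.19,

(1) $\vdash \alpha \subset \beta: \supset : \alpha \subseteq \beta. \sim(\beta \subseteq \alpha)$.

By Thm.IX.4.14, $\vdash \alpha = \beta. \supset . \beta \subseteq \alpha$. So $\vdash \sim(\beta \subseteq \alpha). \supset . \alpha \neq \beta$. So
$\vdash \alpha \subseteq \beta. \sim(\beta \subseteq \alpha). \supset . \alpha \subset \beta$. Combining with (1) gives our theorem.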

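Theorem IX.4.21. $\vdash (\alpha,\beta):. \alpha \subset \beta: \equiv : \alpha \subseteq \beta.(Ex). x \epsilon \beta. \sim x \epsilon \alpha$.

Proof. By the duality theorem, $\vdash \sim(\beta \subseteq \alpha): \equiv :(Ex). x \epsilon \beta. \sim x \epsilon \alpha$.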
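
The characterizations of proper inclusion in Thm.IX.4.20 and Thm.IX.4.21 can be compared with the definition on a small finite universe. The sketch below is only an illustration under that assumption; the function names `proper_4_20`, `proper_4_21`, and `check_proper_inclusion` are hypothetical, and `frozenset` again stands in for classes.

```python
from itertools import combinations

def subclasses(u):
    """Return every subclass of the finite class u as a frozenset."""
    items = list(u)
    return [frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]

def proper_4_20(a, b):
    """Right side of Thm.IX.4.20: a is included in b, and b is not included in a."""
    return a <= b and not (b <= a)

def proper_4_21(a, b):
    """Right side of Thm.IX.4.21: a is included in b, and some member of b is not in a."""
    return a <= b and any(x not in a for x in b)

def check_proper_inclusion(universe):
    """Compare both right sides with the definition: a included in b and a != b."""
    classes = subclasses(universe)
    for a in classes:
        for b in classes:
            by_definition = a <= b and a != b
            if by_definition != proper_4_20(a, b):
                return False
            if by_definition != proper_4_21(a, b):
                return False
    return True

print(check_proper_inclusion(frozenset(range(4))))
```

Python's built-in `a < b` on sets is the same proper-inclusion test, so it could replace `by_definition` here.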

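Theorem IX.4.22.

I. $\vdash (\alpha,\beta,\gamma): \alpha \subset \beta. \beta \subset \gamma. \supset . \alpha \subset \gamma$.

II. $\vdash (\alpha,\beta,\gamma): \alpha \subseteq \beta. \beta \subset \gamma. \supset . \alpha \subset \gamma$.

III. $\vdash (\alpha,\beta,\gamma): \alpha \subset \beta. \beta \subseteq \gamma. \supset . \alpha \subset \gamma$.

Proof of Part I. Assume $\alpha \subset \beta$ and $\beta \subset \gamma$. Then $\alpha \subseteq \beta$, $\alpha \neq \beta$, and
$\beta \subseteq \gamma$. By $\alpha \subseteq \beta$ and $\beta \subseteq \gamma$, we get $\alpha \subseteq \gamma$. It remains to prove $\alpha \neq \gamma$,
which we do by reductio ad absurdum. Assume $\alpha = \gamma$. Then by substituting
in $\beta \subseteq \gamma$, we get $\beta \subseteq \alpha$, which with $\alpha \subseteq \beta$ gives $\alpha = \beta$, which is our
contradiction.

Proofs of Parts II and III are similar.

Theorem IX.4.23. $\vdash (\alpha,\beta): \alpha \subset \beta. \equiv . \bar{\beta} \subset \bar{\alpha}$.

Proof. Combine Thm.IX.4.17 with the result $\vdash \alpha \neq \beta. \equiv . \bar{\alpha} \neq \bar{\beta}$, which
follows from $\vdash \alpha = \beta. \equiv . \bar{\alpha} = \bar{\beta}$, which follows from Thm.IX.4.7.

Theorem IX.4.24.
I. $\vdash (\alpha,\beta,\gamma):. \alpha \subset \beta \cap \gamma: \supset : \alpha \subset \beta. \alpha \subset \gamma$.
II. $\vdash (\alpha,\beta,\gamma):. \alpha \cup \beta \subset \gamma: \supset : \alpha \subset \gamma. \beta \subset \gamma$.

Proof. Analogous to the corresponding portions of the proof of
Thm.IX.4.18.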

EXERCISES 

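IX.4.1. Construct Venn diagrams to show that each of the following is
false:

(a) $(\alpha,\beta,\gamma):. \alpha \subset \beta: \supset : \alpha \cap \gamma \subset \beta \cap \gamma$.

(b) $(\alpha,\beta,\gamma):. \alpha \subset \beta: \supset : \alpha \cup \gamma \subset \beta \cup \gamma$.

(c) $(\alpha,\beta,\gamma):. \alpha \subset \beta. \alpha \subset \gamma: \supset : \alpha \subset \beta \cap \gamma$.

(d) $(\alpha,\beta,\gamma):. \alpha \subset \gamma. \beta \subset \gamma: \supset : \alpha \cup \beta \subset \gamma$.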

IX.4.2.  Illustrate  Thm.IX.4.13  by  a  Venn  diagram. 

IX.4.3.  Illustrate  Parts  VI,  XVI,  XIX,  and  XX  of  Thm.IX.4.4.  by  Venn 
diagrams. 

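IX.4.4. Prove:

(a) $\vdash \exists\hat{x}P. \exists\hat{x}Q: \supset :(x): x \epsilon \hat{x}P \cap \hat{x}Q. \equiv . PQ$.

(b) $\vdash \exists\hat{x}P. \exists\hat{x}Q: \supset :(x): x \epsilon \hat{x}P \cup \hat{x}Q. \equiv . P \vee Q$.

(c) $\vdash \exists\hat{x}P: \supset :(x): x \epsilon \overline{\hat{x}P}. \equiv . \sim P$.

IX.4.5. Prove:

(a) $\vdash \exists\hat{x}P. \exists\hat{x}Q: \supset : \exists\hat{x}(PQ)$.

(b) $\vdash \exists\hat{x}P. \exists\hat{x}Q: \supset : \exists\hat{x}(P \vee Q)$.

(c) $\vdash \exists\hat{x}P. \supset . \exists\hat{x}(\sim P)$.

IX.4.6. Prove:

(a) $\vdash \exists\hat{x}P. \exists\hat{x}Q: \supset : \hat{x}P \cap \hat{x}Q = \hat{x}(PQ)$.

(b) $\vdash \exists\hat{x}P. \exists\hat{x}Q: \supset : \hat{x}P \cup \hat{x}Q = \hat{x}(P \vee Q)$.

(c) $\vdash \exists\hat{x}P. \supset . \overline{\hat{x}P} = \hat{x}(\sim P)$.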

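IX.4.7. Prove:

(a) $\vdash (\alpha,\beta,\gamma). \alpha \cap (\beta - \gamma) = (\alpha \cap \beta) - (\alpha \cap \gamma)$.

(b) $\vdash (\alpha,\beta,\gamma). (\alpha - \beta) \cup (\alpha - \gamma) = \alpha - (\beta \cap \gamma)$.

(c) $\vdash (\alpha,\beta). (\alpha - \beta) \cup (\beta - \alpha) = (\alpha \cup \beta) - (\alpha \cap \beta)$.

(d) $\vdash (\alpha,\beta). \alpha - (\alpha - \beta) = \alpha \cap \beta$.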

(e)  h  (a,/3,7).(«  W  7)  n  (/3  W  7)  =  (a  n  7)  W  (|S  n  7). 
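IX.4.8. Prove:

(a) $\vdash (\alpha,\beta): \alpha \cup \beta = \alpha \cap \beta. \equiv . \alpha = \beta$.

(b) $\vdash (\alpha,\beta): \alpha \cap \beta = \Lambda. \equiv . (\alpha \cup \beta) - \alpha = \beta. \equiv . (\alpha \cup \beta) - \beta = \alpha$.

(c) $\vdash (\alpha,\beta): \alpha \subseteq \beta. \equiv . \alpha \cup (\beta - \alpha) = \beta$.

(d) $\vdash (\alpha,\beta,\gamma):. \beta \subseteq \alpha. \alpha - \beta = \gamma: \supset : \alpha - \gamma = \beta$.

(e) $\vdash (\alpha,\beta,\gamma): \alpha \cap \gamma = \Lambda. \supset . (\alpha \cup \gamma) - (\beta \cup \gamma) = \alpha - \beta$.

(f) $\vdash (\alpha,\beta,\gamma):. \gamma \subseteq \alpha \cup \beta: \supset :(E\theta,\phi): \theta \subseteq \alpha. \phi \subseteq \beta. \gamma = \theta \cup \phi$.

IX.4.9. Prove:

$\vdash (\alpha,\beta):. \alpha \subseteq \beta: \equiv : \alpha \subset \beta. \vee . \alpha = \beta.$

IX.4.10. The quantity $(\alpha - \beta) \cup (\beta - \alpha)$ is sometimes called the
symmetric difference of $\alpha$ and $\beta$ or the sum of $\alpha$ and $\beta$ (modulo 2). Denote
it by $\alpha * \beta$. Prove:

(a) $\vdash (\alpha,\beta). \alpha * \beta = \beta * \alpha$.

(b) $\vdash (\alpha,\beta,\gamma). (\alpha * \beta) * \gamma = \alpha * (\beta * \gamma)$.

(c) $\vdash (\alpha,\beta,\gamma). (\alpha * \beta) \cap \gamma = (\alpha \cap \gamma) * (\beta \cap \gamma)$.

(d) $\vdash (\alpha). \alpha * \Lambda = \alpha$.

(e) $\vdash (\alpha). \alpha * \alpha = \Lambda$.

(f) $\vdash (\alpha,\beta,\gamma): \alpha * \gamma = \beta * \gamma. \supset . \alpha = \beta$.

(g) $\vdash (\alpha,\beta): \alpha = \beta. \equiv . \alpha * \beta = \Lambda$.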
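
Before attempting the proofs in Ex.IX.4.10, it can help to test the seven statements on a small finite universe. The sketch below does this by brute force; it is an experiment under the assumption of a three-element universe, not a substitute for the proofs, and the names `sym` and `check_sym` are hypothetical.

```python
from itertools import combinations

def subclasses(u):
    """Return every subclass of the finite class u as a frozenset."""
    items = list(u)
    return [frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]

def sym(a, b):
    """Symmetric difference as in Ex.IX.4.10: (a - b) united with (b - a)."""
    return (a - b) | (b - a)

def check_sym(universe):
    """Test parts (a) to (g) of Ex.IX.4.10 for all subclasses of universe."""
    empty = frozenset()
    classes = subclasses(universe)
    for a in classes:
        if sym(a, empty) != a or sym(a, a) != empty:         # parts (d), (e)
            return False
        for b in classes:
            if sym(a, b) != sym(b, a):                        # part (a)
                return False
            if (a == b) != (sym(a, b) == empty):              # part (g)
                return False
            for c in classes:
                if sym(sym(a, b), c) != sym(a, sym(b, c)):    # part (b)
                    return False
                if (sym(a, b) & c) != sym(a & c, b & c):      # part (c)
                    return False
                if sym(a, c) == sym(b, c) and a != b:         # part (f)
                    return False
    return True

print(check_sym(frozenset(range(3))))
```

Python's set operator `^` computes the same class as `sym`, which gives an independent cross-check of the definition.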

IX.4.11.     Illustrate  a  *  /3  by  a  Venn  diagram. 
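IX.4.12. Prove:

(a) $\vdash (\alpha,\beta): \beta = (\alpha \cap \beta) \cup ((\alpha \cup \beta) - \alpha)$.

(b) $\vdash (\alpha,\beta,\gamma): \alpha \cup \beta = \alpha \cup \gamma. \alpha \cap \beta = \alpha \cap \gamma. \supset . \beta = \gamma$.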

IX.4.13.  Name  parts  of  Thm.IX.4.4  which  are  illustrations  of  the 
duality  theorem  (Thm.IX.4.5). 

IX.4.14. Name pairs of parts of Thm.IX.4.4 which are duals of the sort
described after the proof of Thm.IX.4.5.

IX.4.15.  Write  a  250-word  essay  on  the  distinction  between  "member 
of"  and  "subclass  of". 

5. Manifold Products and Sums. We wish to generalize the product
$A \cap B$ to

$A \cap B \cap C \cap \cdots$

where the product is extended over all $A$, $B$, $C$, ... which are members of
some class $K$. We denote this product by $\bigcap(K)$, or often simply $\bigcap K$.
To make $\bigcap K$ have the desired properties we define it as

$\hat{x}(\alpha). \alpha \epsilon K \supset x \epsilon \alpha,$

where $x$ and $\alpha$ are distinct variables not occurring in $K$. The reader can
easily verify that this definition makes $\bigcap K$ consist of all objects which are
members of each member of $K$ at once; that is, of all objects in the product
of all members of $K$. This definition is valid whether $K$ is a finite or an
infinite class. If $K$ were finite, with members $A_1, A_2, \ldots, A_n$, one could
use

$A_1 \cap A_2 \cap \cdots \cap A_n$

in place of $\bigcap K$. So the notation $\bigcap K$ is useful mainly in those cases where
$K$ is an infinite class. For this reason, $\bigcap K$ is often called an infinite
product. However, the term "manifold product" is more accurate and is
the name which we shall use for $\bigcap K$.

In analogous fashion, we wish to generalize the sum $A \cup B$ to

$A \cup B \cup C \cup \cdots$

where the sum is extended over all $A$, $B$, $C$, ... which are members of $K$.
The definition $\bigcup(K)$, or $\bigcup K$, for

$\hat{x}(E\alpha). \alpha \epsilon K. x \epsilon \alpha$

(where $x$ and $\alpha$ are distinct variables not occurring in $K$) will give us a
notation for the class in question, since it will make $\bigcup K$ consist of all
objects which are in at least one member of $K$. We shall refer to $\bigcup K$ as a
manifold sum, though it is sometimes referred to as an infinite sum.
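
The two definitions translate directly into code for finite classes. In the sketch below, `big_cap` and `big_cup` are hypothetical helper names, a family `K` is any iterable of `frozenset` values, and `universe` is an assumed finite stand-in for V, needed because the manifold product quantifies over all objects.

```python
def big_cap(K, universe):
    """Manifold product: the objects x such that x is in every member of K."""
    return frozenset(x for x in universe if all(x in a for a in K))

def big_cup(K):
    """Manifold sum: the objects x such that x is in at least one member of K."""
    return frozenset(x for a in K for x in a)

# Assumed example data: a ten-element universe and a family of three classes.
universe = frozenset(range(10))
K = [frozenset({1, 2, 3}), frozenset({2, 3, 4}), frozenset({2, 3, 9})]

print(sorted(big_cap(K, universe)))   # members common to all three classes
print(sorted(big_cup(K)))             # members of at least one class

# With an empty family, "every member of K" holds vacuously.
print(big_cap([], universe) == universe)
print(big_cup([]) == frozenset())
```

The empty-family case shows why the product is defined with a universal quantifier rather than by repeated pairwise intersection: the product of no classes is the whole universe and the sum of no classes is the null class, which is the finite counterpart of Ex.IX.5.1, parts (a) and (b).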

In $\bigcap K$ and $\bigcup K$, we refer to $\bigcap$ and $\bigcup$ as "large cap" and "large cup",
respectively.

Except for $\alpha$ and the bound variables implicit in $\hat{x}$, which are all bound,
a variable is free or bound in $\bigcap K$ or $\bigcup K$ according as it is free or bound
in $K$.

$\bigcap K$ and $\bigcup K$ are stratified if and only if $K$ is stratified. Moreover, for
stratification, one must assign to $\bigcap K$ and $\bigcup K$ types one less than the
type of $K$. That is, the sum or product of the members of $K$ should have
the same type as the members of $K$, namely, one less than the type of $K$.

If $A$ contains free occurrences of $x$, but $A$ and $B$ have no free variables
in common, then $\bigcap\{A \mid x \epsilon B\}$ and $\bigcup\{A \mid x \epsilon B\}$ are commonly denoted by

$\prod_{x \epsilon B} A$ and $\sum_{x \epsilon B} A$

respectively. Other notations for the same thing are to be found in
Bourbaki, 1939, pages 20 to 25, and Hausdorff, 1927, page 18.

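The notions indicated by $\bigcap\{A \mid x \epsilon B\}$ and $\bigcup\{A \mid x \epsilon B\}$ are of frequent
occurrence in the theory of sets. As an example, $\bigcap\{\bar{\alpha} \mid \alpha \epsilon K\}$ is the
product of all complements of members of $K$. A generalization of the law

$\bar{\alpha} \cap \bar{\beta} = \overline{\alpha \cup \beta}$

would be

$\bigcap\{\bar{\alpha} \mid \alpha \epsilon K\} = \overline{\bigcup K}.$

Similarly, a generalization of

$\bar{\alpha} \cup \bar{\beta} = \overline{\alpha \cap \beta}$

would be

$\bigcup\{\bar{\alpha} \mid \alpha \epsilon K\} = \overline{\bigcap K},$

a generalization of

$\alpha \cap (\beta \cup \gamma) = (\alpha \cap \beta) \cup (\alpha \cap \gamma)$

would be

$\alpha \cap \bigcup K = \bigcup\{\alpha \cap \beta \mid \beta \epsilon K\},$

and so on. Indeed, many properties of $\bigcap K$ and $\bigcup K$ are generalizations
of properties of finite products and sums. Most theorems of this section
are analogues of theorems of the preceding section.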

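Analogous to Thm.IX.4.1, we have:

Theorem IX.5.1.

I. $\vdash (\lambda)\ \exists\bigcap\lambda$.

II. $\vdash (\lambda)\ \exists\bigcup\lambda$.

Analogous to Thm.IX.4.2, we have:

Theorem IX.5.2.
**I. $\vdash (\lambda,x):. x \epsilon \bigcap\lambda: \equiv :(\alpha). \alpha \epsilon \lambda \supset x \epsilon \alpha$.
**II. $\vdash (\lambda,x):. x \epsilon \bigcup\lambda: \equiv :(E\alpha). \alpha \epsilon \lambda. x \epsilon \alpha$.

Analogous to the associative laws, Thm.IX.4.4, Parts X and XI, we have:

Theorem IX.5.3.

I. $\vdash (\lambda,\mu). \bigcap\lambda \cap \bigcap\mu = \bigcap(\lambda \cup \mu)$.

II. $\vdash (\lambda,\mu). \bigcup\lambda \cup \bigcup\mu = \bigcup(\lambda \cup \mu)$.

Proof of Part I.

$\vdash x \epsilon \bigcap\lambda \cap \bigcap\mu:.$

$\equiv :. x \epsilon \bigcap\lambda. x \epsilon \bigcap\mu:.$

$\equiv :.(\alpha). \alpha \epsilon \lambda \supset x \epsilon \alpha:(\alpha). \alpha \epsilon \mu \supset x \epsilon \alpha:.$

$\equiv :.(\alpha): \alpha \epsilon \lambda \supset x \epsilon \alpha. \alpha \epsilon \mu \supset x \epsilon \alpha:.$

$\equiv :.(\alpha): \alpha \epsilon \lambda \vee \alpha \epsilon \mu. \supset . x \epsilon \alpha:.$

$\equiv :.(\alpha): \alpha \epsilon \lambda \cup \mu. \supset . x \epsilon \alpha:.$

$\equiv :. x \epsilon \bigcap(\lambda \cup \mu).$

Proof of Part II.

$\vdash x \epsilon \bigcup\lambda \cup \bigcup\mu:.$

$\equiv :.(E\alpha): \alpha \epsilon \lambda. x \epsilon \alpha: \vee :(E\alpha): \alpha \epsilon \mu. x \epsilon \alpha:.$

$\equiv :.(E\alpha): \alpha \epsilon \lambda. x \epsilon \alpha. \vee . \alpha \epsilon \mu. x \epsilon \alpha:.$

$\equiv :.(E\alpha): \alpha \epsilon \lambda \vee \alpha \epsilon \mu. x \epsilon \alpha:.$

$\equiv :.(E\alpha): \alpha \epsilon \lambda \cup \mu. x \epsilon \alpha:.$

$\equiv :. x \epsilon \bigcup(\lambda \cup \mu).$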

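Analogous to Thm.IX.4.4, Parts XXX and XXIX, we have:
Theorem IX.5.4.

I. $\vdash (\lambda): \Lambda \epsilon \lambda. \supset . \bigcap\lambda = \Lambda$.
II. $\vdash (\lambda): V \epsilon \lambda. \supset . \bigcup\lambda = V$.

Proof of Part I. Assume $\Lambda \epsilon \lambda$. Putting $\Lambda \epsilon \lambda$ and $x \epsilon \Lambda$ for $P$ and $Q$ in
$\vdash P \supset :\sim Q. \supset .\sim(P \supset Q)$, we get $\sim(\Lambda \epsilon \lambda \supset x \epsilon \Lambda)$. So $(E\alpha)\sim(\alpha \epsilon \lambda \supset x \epsilon \alpha)$.
So $\sim(\alpha). \alpha \epsilon \lambda \supset x \epsilon \alpha$, and hence $\sim x \epsilon \bigcap\lambda$. So $(x). \sim x \epsilon \bigcap\lambda$.
By Thm.IX.4.12, Part II, $\bigcap\lambda = \Lambda$.

Proof of Part II is similar.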

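Analogous to Thm.IX.4.13, Cor. 6 and Cor. 7, we have:

Theorem IX.5.5.
I. $\vdash (\lambda,\alpha): \alpha \epsilon \lambda. \supset . \bigcap\lambda \subseteq \alpha$.
II. $\vdash (\lambda,\alpha): \alpha \epsilon \lambda. \supset . \alpha \subseteq \bigcup\lambda$.

Proof of Part I. By Axiom scheme 6, $\vdash x \epsilon \bigcap\lambda: \supset : \alpha \epsilon \lambda. \supset . x \epsilon \alpha$. So
$\vdash \alpha \epsilon \lambda: \supset : x \epsilon \bigcap\lambda. \supset . x \epsilon \alpha$. So by rule G and Thm.VI.6.6, Part IX,
$\vdash \alpha \epsilon \lambda: \supset :(x): x \epsilon \bigcap\lambda. \supset . x \epsilon \alpha$.

Proof of Part II. By Thm.VI.7.3, $\vdash \alpha \epsilon \lambda. x \epsilon \alpha. \supset . x \epsilon \bigcup\lambda$. So $\vdash \alpha \epsilon \lambda: \supset :$
$x \epsilon \alpha. \supset . x \epsilon \bigcup\lambda$, and so $\vdash \alpha \epsilon \lambda: \supset :(x): x \epsilon \alpha. \supset . x \epsilon \bigcup\lambda$.

Corollary. $\vdash (\lambda): \lambda \neq \Lambda. \supset . \bigcap\lambda \subseteq \bigcup\lambda$.

Proof. Use Thm.IX.4.12, corollary, Part II, and Thm.IX.4.15.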

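A different generalization of Thm.IX.4.13, Cor. 6 and Cor. 7, is given by
the statements that a product of several classes includes a product of still
more classes, and a sum of several classes is included in a sum of still more
classes. These are expressed in:

Theorem IX.5.6.

I. $\vdash (\lambda,\mu): \lambda \subseteq \mu. \supset . \bigcap\mu \subseteq \bigcap\lambda$.
II. $\vdash (\lambda,\mu): \lambda \subseteq \mu. \supset . \bigcup\lambda \subseteq \bigcup\mu$.

Proof of Part I. Assume $\lambda \subseteq \mu$. Then $\alpha \epsilon \lambda \supset \alpha \epsilon \mu$. So $\alpha \epsilon \mu \supset x \epsilon \alpha. \supset . \alpha \epsilon \lambda \supset x \epsilon \alpha$.
So $(\alpha). \alpha \epsilon \mu \supset x \epsilon \alpha: \supset :(\alpha). \alpha \epsilon \lambda \supset x \epsilon \alpha$. This is

$x \epsilon \bigcap\mu. \supset . x \epsilon \bigcap\lambda.$

Proof of Part II. Assume $\lambda \subseteq \mu$. Then $\alpha \epsilon \lambda \supset \alpha \epsilon \mu$. So $\alpha \epsilon \lambda. x \epsilon \alpha. \supset . \alpha \epsilon \mu. x \epsilon \alpha$.
So $(E\alpha). \alpha \epsilon \lambda. x \epsilon \alpha: \supset :(E\alpha). \alpha \epsilon \mu. x \epsilon \alpha$.

The generalizations of Thm.IX.4.16, Cor. 1 and Cor. 2, would be that if
one can make the members of $\lambda$ and $\mu$ correspond in such a way that every
member of $\lambda$ is included in the corresponding member of $\mu$, then $\bigcap\lambda \subseteq \bigcap\mu$
and $\bigcup\lambda \subseteq \bigcup\mu$. One can prove slightly stronger results, which we express
in the following theorem.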

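Theorem IX.5.7.
I. $\vdash (\lambda,\mu)::(\beta): \beta \epsilon \mu. \supset .(E\alpha). \alpha \epsilon \lambda. \alpha \subseteq \beta.: \supset :. \bigcap\lambda \subseteq \bigcap\mu$.
II. $\vdash (\lambda,\mu)::(\alpha): \alpha \epsilon \lambda. \supset .(E\beta). \beta \epsilon \mu. \alpha \subseteq \beta.: \supset :. \bigcup\lambda \subseteq \bigcup\mu$.

Proof of Part I. Assume

(1) $(\beta): \beta \epsilon \mu. \supset .(E\alpha). \alpha \epsilon \lambda. \alpha \subseteq \beta$,

(2) $x \epsilon \bigcap\lambda$,

(3) $\beta \epsilon \mu$.

By (3) and (1), $(E\alpha). \alpha \epsilon \lambda. \alpha \subseteq \beta$, so that by rule C

(4) $\alpha \epsilon \lambda$,

(5) $\alpha \subseteq \beta$.

By (4) and (2), $x \epsilon \alpha$. Then by (5), $x \epsilon \beta$. So we have proved

(1), $x \epsilon \bigcap\lambda$, $\beta \epsilon \mu \vdash x \epsilon \beta$.

By the deduction theorem

(1), $x \epsilon \bigcap\lambda \vdash \beta \epsilon \mu \supset x \epsilon \beta$.

By rule G

(1), $x \epsilon \bigcap\lambda \vdash x \epsilon \bigcap\mu$.

Proof of Part II. Assume

(1) $(\alpha): \alpha \epsilon \lambda. \supset .(E\beta). \beta \epsilon \mu. \alpha \subseteq \beta$,

(2) $x \epsilon \bigcup\lambda$.

Then, by (2) and rule C

(3) $\alpha \epsilon \lambda$,

(4) $x \epsilon \alpha$.

So by (3) and (1), $(E\beta). \beta \epsilon \mu. \alpha \subseteq \beta$.
By rule C

(5) $\beta \epsilon \mu$,

(6) $\alpha \subseteq \beta$.

By (4) and (6), $x \epsilon \beta$. Using this and (5), we have $(E\beta). \beta \epsilon \mu. x \epsilon \beta$. That
is, $x \epsilon \bigcup\mu$.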

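Analogous to Thm.IX.4.18, we have:

Theorem IX.5.8.

I. $\vdash (\lambda,\alpha):. \alpha \subseteq \bigcap\lambda: \equiv :(\beta): \beta \epsilon \lambda. \supset . \alpha \subseteq \beta$.
*II. $\vdash (\lambda,\gamma):. \bigcup\lambda \subseteq \gamma: \equiv :(\beta): \beta \epsilon \lambda. \supset . \beta \subseteq \gamma$.

Proof of Part I.

$\vdash \alpha \subseteq \bigcap\lambda:.$

$\equiv :.(x): x \epsilon \alpha. \supset . x \epsilon \bigcap\lambda:.$

$\equiv :.(x): x \epsilon \alpha. \supset .(\beta). \beta \epsilon \lambda \supset x \epsilon \beta:.$

$\equiv :.(x,\beta): x \epsilon \alpha. \supset . \beta \epsilon \lambda \supset x \epsilon \beta:.$

$\equiv :.(\beta,x): \beta \epsilon \lambda. \supset . x \epsilon \alpha \supset x \epsilon \beta:.$

$\equiv :.(\beta): \beta \epsilon \lambda. \supset .(x). x \epsilon \alpha \supset x \epsilon \beta:.$

$\equiv :.(\beta): \beta \epsilon \lambda. \supset . \alpha \subseteq \beta.$

Proof of Part II.

$\vdash \bigcup\lambda \subseteq \gamma:.$

$\equiv :.(x): x \epsilon \bigcup\lambda. \supset . x \epsilon \gamma:.$

$\equiv :.(x):(E\beta). \beta \epsilon \lambda. x \epsilon \beta. \supset . x \epsilon \gamma:.$

$\equiv :.(x,\beta): \beta \epsilon \lambda. x \epsilon \beta. \supset . x \epsilon \gamma:.$

$\equiv :.(\beta,x): \beta \epsilon \lambda. \supset . x \epsilon \beta \supset x \epsilon \gamma:.$

$\equiv :.(\beta): \beta \epsilon \lambda. \supset .(x). x \epsilon \beta \supset x \epsilon \gamma:.$

$\equiv :.(\beta): \beta \epsilon \lambda. \supset . \beta \subseteq \gamma.$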

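Analogous to Thm.IX.4.24, we have:
Theorem IX.5.9.

I. $\vdash (\lambda,\alpha):. \alpha \subset \bigcap\lambda: \supset :(\beta): \beta \epsilon \lambda. \supset . \alpha \subset \beta$.
II. $\vdash (\lambda,\gamma):. \bigcup\lambda \subset \gamma: \supset :(\beta): \beta \epsilon \lambda. \supset . \beta \subset \gamma$.
Proof of Part I. Assume

(1) $\alpha \subset \bigcap\lambda$,

(2) $\beta \epsilon \lambda$.

By (2) and Thm.IX.5.5, $\bigcap\lambda \subseteq \beta$. By this and (1), $\alpha \subset \beta$.

Proof of Part II is similar.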

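We have already indicated what are the generalizations of Thm.IX.4.4,
Parts XVI and XVII. To prove these, we need certain preliminary results.
Theorem IX.5.10.

I. $\vdash (\lambda)\ \exists\{\bar{\alpha} \mid \alpha \epsilon \lambda\}$.
II. $\vdash (\lambda,\beta): \beta \epsilon \{\bar{\alpha} \mid \alpha \epsilon \lambda\}. \equiv . \bar{\beta} \epsilon \lambda$.
III. $\vdash (\lambda,\beta): \beta \epsilon \{\bar{\alpha} \mid \alpha \epsilon \lambda\}. \equiv .(E\alpha). \alpha \epsilon \lambda. \beta = \bar{\alpha}$.

Proof of Part I. Use Thm.IX.3.1.

Proof of Part II. Use Thm.IX.4.8.

Proof of Part III. Use Thm.IX.3.2, Cor. 3.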

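We now get the generalizations of Thm.IX.4.4, Parts XVI and XVII.

Theorem IX.5.11.

I. $\vdash (\lambda). \bigcap\{\bar{\alpha} \mid \alpha \epsilon \lambda\} = \overline{\bigcup\lambda}$.

II. $\vdash (\lambda). \bigcup\{\bar{\alpha} \mid \alpha \epsilon \lambda\} = \overline{\bigcap\lambda}$.

Proof of Part I. Using Thm.IX.5.10, Part III, and Thm.VII.1.5, Part II,
we get

$\vdash x \epsilon \bigcap\{\bar{\alpha} \mid \alpha \epsilon \lambda\}:.$

$\equiv :.(\beta): \beta \epsilon \{\bar{\alpha} \mid \alpha \epsilon \lambda\}. \supset . x \epsilon \beta:.$

$\equiv :.(\beta):(E\alpha). \alpha \epsilon \lambda. \beta = \bar{\alpha}. \supset . x \epsilon \beta:.$

$\equiv :.(\beta,\alpha): \alpha \epsilon \lambda. \beta = \bar{\alpha}. \supset . x \epsilon \beta:.$

$\equiv :.(\alpha,\beta): \alpha \epsilon \lambda. \supset . \beta = \bar{\alpha} \supset x \epsilon \beta:.$

$\equiv :.(\alpha): \alpha \epsilon \lambda. \supset .(\beta). \beta = \bar{\alpha} \supset x \epsilon \beta:.$

$\equiv :.(\alpha): \alpha \epsilon \lambda. \supset . x \epsilon \bar{\alpha}:.$

$\equiv :.(\alpha): \alpha \epsilon \lambda. \supset . \sim x \epsilon \alpha:.$

$\equiv :. \sim(E\alpha). \alpha \epsilon \lambda. x \epsilon \alpha:.$

$\equiv :. \sim x \epsilon \bigcup\lambda:.$

$\equiv :. x \epsilon \overline{\bigcup\lambda}.$

Proof of Part II. Using Thm.IX.5.10, Part III, and Thm.VII.1.5, Part I,
we get

$\vdash x \epsilon \bigcup\{\bar{\alpha} \mid \alpha \epsilon \lambda\}:.$

$\equiv :.(E\beta): \beta \epsilon \{\bar{\alpha} \mid \alpha \epsilon \lambda\}. x \epsilon \beta:.$

$\equiv :.(E\beta):(E\alpha). \alpha \epsilon \lambda. \beta = \bar{\alpha}: x \epsilon \beta:.$

$\equiv :.(E\beta,\alpha): \alpha \epsilon \lambda. \beta = \bar{\alpha}. x \epsilon \beta:.$

$\equiv :.(E\alpha): \alpha \epsilon \lambda.(E\beta). \beta = \bar{\alpha}. x \epsilon \beta:.$

$\equiv :.(E\alpha): \alpha \epsilon \lambda. x \epsilon \bar{\alpha}:.$

$\equiv :.(E\alpha): \alpha \epsilon \lambda. \sim x \epsilon \alpha:.$

$\equiv :. \sim(\beta): \beta \epsilon \lambda. \supset . x \epsilon \beta:.$

$\equiv :. x \epsilon \overline{\bigcap\lambda}.$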
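
The two generalized laws of Thm.IX.5.11 can be tried out on finite families. The sketch below, under the assumption of a three-element universe standing in for V, checks both parts for every family of subclasses, including the empty family. The helper names are hypothetical and the check is illustrative only.

```python
from itertools import combinations

# Assumption: a three-element class plays the role of the universal class V.
UNIVERSE = frozenset(range(3))

def subclasses(u):
    """Return every subclass of the finite collection u as a frozenset."""
    items = list(u)
    return [frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]

def complement(a):
    """Complement of a relative to UNIVERSE."""
    return UNIVERSE - a

def big_cap(K):
    """Manifold product of the family K, relative to UNIVERSE."""
    return frozenset(x for x in UNIVERSE if all(x in a for a in K))

def big_cup(K):
    """Manifold sum of the family K."""
    return frozenset(x for a in K for x in a)

def check_thm_5_11():
    """Test both parts of Thm.IX.5.11 for every family of subclasses."""
    for K in subclasses(subclasses(UNIVERSE)):
        complements = frozenset(complement(a) for a in K)
        if big_cap(complements) != complement(big_cup(K)):    # Part I
            return False
        if big_cup(complements) != complement(big_cap(K)):    # Part II
            return False
    return True

print(check_thm_5_11())
```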

We  have  by  no  means  listed  all  possible  generalizations  of  theorems  of 
Sec.  4,  but  we  have  listed  those  of  importance. 

One particularly important use of manifold products is in defining the
closure of a class with respect to an operation or a set of operations. For
example, when we wish to adjoin a root $\theta$ of an algebraic equation $P(\theta) = 0$
with rational coefficients to the field of rational numbers, we wish the least
class $R(\theta)$ such that:

(A) $R(\theta)$ includes $\theta$ and all rationals.

(B) $R(\theta)$ is closed under each of the operations of adding together two
members of $R(\theta)$, multiplying together two members of $R(\theta)$, and multiplying
any member of $R(\theta)$ by a rational.

If we let $R$ denote the class of rationals, we can rewrite (A) and (B) as
follows.

(A1) $R \subseteq R(\theta)$.

(A2) $\theta \epsilon R(\theta)$.

(B1) $(x,y): x,y \epsilon R(\theta). \supset . x + y \epsilon R(\theta)$.

(B2) $(x,y): x,y \epsilon R(\theta). \supset . xy \epsilon R(\theta)$.

(B3) $(x,y): x \epsilon R(\theta). y \epsilon R. \supset . xy \epsilon R(\theta)$.

We express (B1), (B2), and (B3) by saying that $R(\theta)$ is closed with
respect to the operations "plus," "times," and "multiplication by a
rational."

Closure with respect to a relation often occurs and is more general. Thus,
if we let $S_1(x,y,z)$, $S_2(x,y,z)$, and $S_3(x,z)$ denote, respectively, $x + y = z$,
$xy = z$, and $(Ey). y \epsilon R. xy = z$, then we can rewrite (B1), (B2), and (B3) as:

(B1) $(x,y,z): x,y \epsilon R(\theta). S_1(x,y,z). \supset . z \epsilon R(\theta)$.
(B2) $(x,y,z): x,y \epsilon R(\theta). S_2(x,y,z). \supset . z \epsilon R(\theta)$.
(B3) $(x,z): x \epsilon R(\theta). S_3(x,z). \supset . z \epsilon R(\theta)$.

In this form, we express (B1), (B2), and (B3) by saying that $R(\theta)$ is
closed with respect to the relations $S_1$, $S_2$, and $S_3$.
Finally, we can write $S(x,y,z)$ for

$S_1(x,y,z) \vee S_2(x,y,z) \vee S_3(x,z),$

and then (B1), (B2), and (B3) can be telescoped to:

(B) $(x,y,z): x,y \epsilon R(\theta). S(x,y,z). \supset . z \epsilon R(\theta)$.

This is expressed by saying that $R(\theta)$ is closed with respect to the relation $S$.
The discussion above should make it clear that closure with respect to
any set of operations or any set of relations can be reduced to closure with
respect to a single relation. In general, this might be a relation between
many things, so that one would wish to write it as $S(x_1, \ldots, x_n, z)$. Even
more generally, it might involve various parameters $r_1, \ldots, r_m$, so that one
would write it as $S(x_1, \ldots, x_n, z, r_1, \ldots, r_m)$. The closure of a class $A$ with
respect to $S$ will be denoted by

$\mathrm{Clos}(A, S(x_1, \ldots, x_n, z, r_1, \ldots, r_m)).$

More precisely, we shall make the following definition, due essentially
to Frege (see Frege, 1879).

When accompanied by a specification as to which free variables of $P$
are to be considered as $x_1, \ldots, x_n, z, r_1, \ldots, r_m$, respectively, and provided
that there are no free occurrences of $x_1, \ldots, x_n, z$ in $A$,

$\mathrm{Clos}(A,P)$

shall denote

$\bigcap\hat{\beta}(A \subseteq \beta:.(x_1, \ldots, x_n, z): x_1, \ldots, x_n \epsilon \beta. P. \supset . z \epsilon \beta),$

where $\beta$ is a variable not appearing in $A$ or $P$.

In $\mathrm{Clos}(A,P)$, the variables $x_1, \ldots, x_n, z$ are bound, as also are the bound
variables implicit in $\bigcap\hat{\beta}$ and $A \subseteq \beta$; other than these, variables occur free
or bound in $\mathrm{Clos}(A,P)$ according as they occur free or bound in $A$ or $P$.

A set of general rules for determining if $\mathrm{Clos}(A,P)$ is stratified without
writing it out would be too complicated to be worth stating. If the question
of stratification of $\mathrm{Clos}(A,P)$ arises, the best procedure is to write out
the definition. However, one should note that, if it is stratified, it must
have the same type as $A$.

Temporarily throughout the next five theorems, let $H(A,\beta,P)$ denote

$A \subseteq \beta:.(x_1, \ldots, x_n, z): x_1, \ldots, x_n \epsilon \beta. P. \supset . z \epsilon \beta,$

where $\beta$ is distinct from $x_1, \ldots, x_n, z$ and does not occur in $A$ or $P$, so that
$\mathrm{Clos}(A,P) = \bigcap\hat{\beta}(H(A,\beta,P))$.

The meaning of $H(A,\beta,P)$ is that $\beta$ includes $A$ and is closed with respect
to the relation $P$. Such $\beta$'s exist, for example, $\beta = V$. It is less obvious
that $\mathrm{Clos}(A,P)$ is such a $\beta$, but we shall prove this. As $\mathrm{Clos}(A,P)$ is the
logical product of all such $\beta$'s, it is necessarily the least, by Thm.IX.5.5,
Part I.

By putting $\hat{\beta}H(A,\beta,P)$ for $\lambda$ in Thm.IX.5.1, Part I, we get $\vdash \exists\mathrm{Clos}(A,P)$.
Surprisingly enough, this fact is not particularly useful. What is required
to prove our theorems about $\mathrm{Clos}(A,P)$ is the hypothesis $\exists\hat{\beta}H(A,\beta,P)$.
Actually, $\hat{\beta}H(A,\beta,P)$ is stratified whenever $\mathrm{Clos}(A,P)$ is stratified, so that
the hypothesis $\exists\hat{\beta}H(A,\beta,P)$ will be available in all those cases in which one
would expect to be able to do anything significant with $\mathrm{Clos}(A,P)$.

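Theorem IX.5.12. $\vdash (\alpha):.\exists\hat{\beta}H(\alpha,\beta,P): \supset :(x): x \epsilon \mathrm{Clos}(\alpha,P). \equiv .(\beta). H(\alpha,\beta,P) \supset x \epsilon \beta$.

Proof. By Thm.IX.5.2, Part I, $\vdash (x): x \epsilon \mathrm{Clos}(\alpha,P). \equiv .(\beta). \beta \epsilon \hat{\beta}(H(\alpha,\beta,P)) \supset x \epsilon \beta$.
However, by Thm.IX.3.2, Cor. 1, $\vdash \exists\hat{\beta}H(\alpha,\beta,P): \supset :(\beta). \beta \epsilon \hat{\beta}(H(\alpha,\beta,P)) \equiv H(\alpha,\beta,P)$.

Theorem IX.5.13. $\vdash (\alpha): \exists\hat{\beta}H(\alpha,\beta,P). \supset . \alpha \subseteq \mathrm{Clos}(\alpha,P)$.

Proof. Assume $\exists\hat{\beta}H(\alpha,\beta,P)$. Then by Thm.IX.3.2, Cor. 1, and the
definition of $H(\alpha,\beta,P)$, $(\beta): \beta \epsilon \hat{\beta}(H(\alpha,\beta,P)). \supset . \alpha \subseteq \beta$. So, putting
$\hat{\beta}H(\alpha,\beta,P)$ for $\lambda$ in Thm.IX.5.8, Part I, we get our theorem.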

Theorem IX.5.14. $\vdash (\alpha):.\exists\hat{\beta}H(\alpha,\beta,P): \supset :(x_1, \ldots, x_n, z): x_1, \ldots, x_n \epsilon \mathrm{Clos}(\alpha,P). P. \supset . z \epsilon \mathrm{Clos}(\alpha,P)$.

Proof. Assume

(1) $\exists\hat{\beta}H(\alpha,\beta,P)$,

(2) $x_1, \ldots, x_n \epsilon \mathrm{Clos}(\alpha,P). P$,

(3) $H(\alpha,\beta,P)$.

By (1), (2), (3), and Thm.IX.5.12,

(4) $x_1, \ldots, x_n \epsilon \beta. P$.

Using (4), (3), and the definition of $H(\alpha,\beta,P)$, we infer $z \epsilon \beta$. So we have
shown

(1), (2), $H(\alpha,\beta,P) \vdash z \epsilon \beta$.

So

(1), (2) $\vdash (\beta). H(\alpha,\beta,P) \supset z \epsilon \beta$.

So by Thm.IX.5.12,

(1), (2) $\vdash z \epsilon \mathrm{Clos}(\alpha,P)$.

Thm.IX.5.13 says that $\alpha$ is included in $\mathrm{Clos}(\alpha,P)$, and Thm.IX.5.14
says that $\mathrm{Clos}(\alpha,P)$ is closed with respect to the relation $P$. We wish to
verify also that $\mathrm{Clos}(\alpha,P)$ is the least class which includes $\alpha$ and is closed
with respect to $P$. However, this is given to us by Thm.IX.5.12. For let
$\beta$ include $\alpha$ and be closed with respect to $P$. That is, assume $H(\alpha,\beta,P)$.
Then, by Thm.IX.5.12, $(x). x \epsilon \mathrm{Clos}(\alpha,P) \supset x \epsilon \beta$. So $\mathrm{Clos}(\alpha,P) \subseteq \beta$, and
$\beta$ is no smaller than $\mathrm{Clos}(\alpha,P)$.

Actually, we need a stronger version of the result that $\mathrm{Clos}(\alpha,P)$ is the
least class which includes $\alpha$ and is closed with respect to $P$. This result we
now prove.

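Theorem IX.5.15. $\vdash (\alpha)::\exists\hat{\beta}H(\alpha,\beta,P).: \supset :.(\beta):. \alpha \subseteq \beta:(x_1, \ldots, x_n, z): x_1, \ldots, x_n \epsilon \beta. x_1, \ldots, x_n \epsilon \mathrm{Clos}(\alpha,P). P. \supset . z \epsilon \beta: \supset : \mathrm{Clos}(\alpha,P) \subseteq \beta$.

Proof. Assume

(1) $\exists\hat{\beta}H(\alpha,\beta,P)$,

(2) $\alpha \subseteq \beta:(x_1, \ldots, x_n, z): x_1, \ldots, x_n \epsilon \beta. x_1, \ldots, x_n \epsilon \mathrm{Clos}(\alpha,P). P. \supset . z \epsilon \beta$,

(3) $x \epsilon \mathrm{Clos}(\alpha,P)$.

Temporarily write $A$ for $\beta \cap \mathrm{Clos}(\alpha,P)$. Then by Thm.IX.4.2, Part III,

(4) $\vdash (w): w \epsilon A. \equiv . w \epsilon \beta. w \epsilon \mathrm{Clos}(\alpha,P)$.

Then by (2),

$(x_1, \ldots, x_n, z): x_1, \ldots, x_n \epsilon A. P. \supset . z \epsilon \beta.$

So by (1) and Thm.IX.5.14,

(5) $(x_1, \ldots, x_n, z): x_1, \ldots, x_n \epsilon A. P. \supset . z \epsilon A$.

Also by (2), we get $\alpha \subseteq \beta$ and by (1) and Thm.IX.5.13 we get
$\alpha \subseteq \mathrm{Clos}(\alpha,P)$. So by Thm.IX.4.18, Part I,

(6) $\alpha \subseteq A$.

Then by (5) and (6), we have $H(\alpha,A,P)$. Using this and (3) and (1)
with Thm.IX.5.12, we get $x \epsilon A$. So by (4), we get finally $x \epsilon \beta$. Since we
derived this from (1), (2), and (3), our theorem follows.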

One may think of $\mathrm{Clos}(\alpha,P)$ as being generated as follows. Start with $\alpha$.
As a first step toward achieving closure with respect to $P$, add to $\alpha$ all $z$'s
such that $x_1, \ldots, x_n$ are in $\alpha$ and $P$ holds. If we call the resulting enlarged
class $\beta_1$, we enlarge again by adding to $\beta_1$ all $z$'s such that $x_1, \ldots, x_n$ are in
$\beta_1$ and $P$ holds. We can call the resulting class $\beta_2$ and enlarge again and
again to get $\beta_3, \beta_4, \ldots$. Then

$\mathrm{Clos}(\alpha,P) = \alpha \cup \beta_1 \cup \beta_2 \cup \cdots.$

We cannot prove this result exactly until we are in a position to define the
sequence $\beta_1, \beta_2, \ldots$ in the symbolic logic. However, the next theorem
makes the result look plausible. One can interpret the next theorem as
saying that, if $z$ is in $\mathrm{Clos}(\alpha,P)$, then either $z$ was in $\alpha$ to begin with or else
$z$ was put in because some $x_1, \ldots, x_n$ had already been put in and $x_1, \ldots, x_n$
had the relation $P$ to $z$.
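
For a finite universe the two descriptions of the closure can be computed and compared: the definition as a manifold product of all classes that include the starting class and are closed, and the step-by-step enlargement just described. The sketch below assumes a hypothetical example that is not from the text: the universe is the integers 0 to 11, and the relation P(x, y, z) holds when z is the sum of x and y modulo 12. All function names are illustrative.

```python
from itertools import combinations

# Assumption: the integers modulo 12 stand in for the universal class V.
UNIVERSE = frozenset(range(12))

def related(x, y, z):
    """Hypothetical relation P: z is the sum of x and y modulo 12."""
    return (x + y) % 12 == z

def is_closed(b):
    """True if x, y in b together with P(x, y, z) always give z in b."""
    return all(z in b for x in b for y in b
               for z in UNIVERSE if related(x, y, z))

def clos_by_product(a):
    """Manifold product of every class that includes a and is closed."""
    result = UNIVERSE                      # V itself is one such class
    items = sorted(UNIVERSE)
    for r in range(len(items) + 1):
        for c in combinations(items, r):
            b = frozenset(c)
            if a <= b and is_closed(b):
                result &= b
    return result

def clos_by_steps(a):
    """Enlarge a repeatedly by the z's forced by P until nothing new appears."""
    current = frozenset(a)
    while True:
        added = frozenset(z for x in current for y in current
                          for z in UNIVERSE if related(x, y, z))
        enlarged = current | added
        if enlarged == current:
            return current
        current = enlarged

start = frozenset({4})
print(sorted(clos_by_steps(start)))
print(clos_by_steps(start) == clos_by_product(start))
```

Starting from the class whose only member is 4, the enlargement adds 8 and then 0 and stops. In a finite universe the sequence of enlargements must stabilize, which is why the loop terminates; the text's caution about defining the sequence in the symbolic logic concerns the general, possibly infinite, case.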

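Theorem IX.5.16. $\vdash (\alpha)::\exists\hat{\beta}H(\alpha,\beta,P).\exists\hat{z}(z \epsilon \alpha. \vee .(Ex_1, \ldots, x_n). x_1, \ldots, x_n \epsilon \mathrm{Clos}(\alpha,P). P).: \supset :.(z):. z \epsilon \mathrm{Clos}(\alpha,P): \equiv : z \epsilon \alpha. \vee .(Ex_1, \ldots, x_n). x_1, \ldots, x_n \epsilon \mathrm{Clos}(\alpha,P). P$.

Proof. By Thm.IX.5.13 and Thm.IX.5.14, we easily infer

(1) $\vdash \exists\hat{\beta}H(\alpha,\beta,P).: \supset :.(z):. z \epsilon \alpha. \vee .(Ex_1, \ldots, x_n). x_1, \ldots, x_n \epsilon \mathrm{Clos}(\alpha,P). P: \supset : z \epsilon \mathrm{Clos}(\alpha,P)$.

Now assume

(2) $\exists\hat{\beta}H(\alpha,\beta,P)$,

(3) $\exists\hat{z}(z \epsilon \alpha. \vee .(Ex_1, \ldots, x_n). x_1, \ldots, x_n \epsilon \mathrm{Clos}(\alpha,P). P)$.

Temporarily let $A$ denote

$\hat{z}(z \epsilon \alpha. \vee .(Ex_1, \ldots, x_n). x_1, \ldots, x_n \epsilon \mathrm{Clos}(\alpha,P). P).$

Then by (3) and Thm.IX.3.2, Cor. 1,

(4) $(z):. z \epsilon A: \equiv : z \epsilon \alpha. \vee .(Ex_1, \ldots, x_n). x_1, \ldots, x_n \epsilon \mathrm{Clos}(\alpha,P). P$.

From this, we readily infer

(5) $\alpha \subseteq A$,

(6) $(x_1, \ldots, x_n, z): x_1, \ldots, x_n \epsilon A. x_1, \ldots, x_n \epsilon \mathrm{Clos}(\alpha,P). P. \supset . z \epsilon A$.

Then, by (2), (5), (6), and Thm.IX.5.15,

(7) $\mathrm{Clos}(\alpha,P) \subseteq A$.

By (4), we have then shown

(2), (3) $\vdash (z):. z \epsilon \mathrm{Clos}(\alpha,P): \supset : z \epsilon \alpha. \vee .(Ex_1, \ldots, x_n). x_1, \ldots, x_n \epsilon \mathrm{Clos}(\alpha,P). P$.

Combining this result with (1) gives our theorem.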

EXERCISES
IX.5.1. Prove:

(a) $\vdash \bigcap\Lambda = V$.

(b) $\vdash \bigcup\Lambda = \Lambda$.

(c) $\vdash \bigcap V = \Lambda$.

(d) $\vdash \bigcup V = V$.

IX.5.2. State and prove the generalization of Thm.IX.4.4, Part XII.

IX.5.3. State and prove the generalization of Thm.IX.4.4, Part XIII.

IX.5.4. Suppose that 0 is stratified and has no free variables, and
$x + 1$ is stratified and has the same type as $x$ and has only $x$ as a free
variable, and we define Nn as

$\hat{x}((\beta):. 0 \epsilon \beta:(y). y \epsilon \beta \supset y + 1 \epsilon \beta: \supset : x \epsilon \beta).$

Prove:

(a) $\vdash 0 \epsilon \mathrm{Nn}$.

(b) $\vdash (x): x \epsilon \mathrm{Nn}. \supset . x + 1 \epsilon \mathrm{Nn}$.

(c) $\vdash (\beta):: 0 \epsilon \beta:.(y): y \epsilon \beta. y \epsilon \mathrm{Nn}. \supset . y + 1 \epsilon \beta.: \supset :. \mathrm{Nn} \subseteq \beta$.

(d) $\vdash (x):. x \epsilon \mathrm{Nn}: \equiv : x = 0. \vee .(Ey). y \epsilon \mathrm{Nn}. x = y + 1$.

(Hint. Take $\alpha$ to be $\hat{z}(z = 0)$ and $P$ to be $x + 1 = z$.)

IX.5.5. Identify the class Nn of the previous exercise as a class familiar
in mathematics.

IX.5.6. Identify part (c) of Ex.IX.5.4 as a familiar principle of
mathematics.

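IX.5.7. Let us say that $\alpha$ is the least member of $\lambda$ if

$\alpha \in \lambda:(\beta).\beta \in \lambda \supset \alpha \subseteq \beta$.

Prove the following results about least members of $\lambda$.

(a) $\vdash (\lambda,\alpha):.\alpha \in \lambda:(\beta).\beta \in \lambda \supset \alpha \subseteq \beta: \supset :\alpha = \bigcap\lambda$.

(b) $\vdash (\lambda,\alpha):.\alpha \in \lambda.\alpha = \bigcap\lambda: \equiv :\alpha \in \lambda:(\beta).\beta \in \lambda \supset \alpha \subseteq \beta$.

(c) $\vdash (\lambda):.\bigcap\lambda \in \lambda: \equiv :(E\alpha):\alpha \in \lambda:(\beta).\beta \in \lambda \supset \alpha \subseteq \beta$.

(d) $\vdash (\lambda)::(E\alpha):\alpha \in \lambda:(\beta).\beta \in \lambda \supset \alpha \subseteq \beta.: \supset :.(E_1\alpha):\alpha \in \lambda:(\beta).\beta \in \lambda \supset \alpha \subseteq \beta$.

IX.5.8. Give an illustration of a class $\lambda$ which has no least member in
the sense of Ex.IX.5.7.

IX.5.9. Prove $\vdash \bigcup\{\bigcup\alpha \mid \alpha \in \lambda\} = \bigcup(\bigcup\lambda)$.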

6. Unit Classes and Subclasses. The unit class of $A$, $\{A\}$, is the class
whose sole member is $A$. That is, we define

$\{A\}$        for        $\hat{x}(x = A)$

where $x$ is a variable not occurring in $A$. The class whose sole members
are $A$ and $B$ will be

$\{A\} \cup \{B\}$,

which we shall denote by

$\{A,B\}$.

Clearly $A$ and $B$ need not be distinct. However, when $A$ and $B$ are the
same, one easily proves that $\{A,B\}$ and $\{A\}$ are the same.

More generally, the class whose sole members are $A_1, \ldots, A_n$ is

$\{A_1\} \cup \cdots \cup \{A_n\}$,

which we shall denote by

$\{A_1, \ldots, A_n\}$.

Except for bound variables implicit in $\hat{x}$ and $\cup$, variables are free or
bound in $\{A_1, \ldots, A_n\}$ according as they are free or bound in $A_1, \ldots, A_n$.
$\{A\}$ is stratified if and only if $A$ is stratified, and if $\{A\}$ is stratified, its
type is one higher than the type of $A$. $\{A_1, \ldots, A_n\}$ is stratified if and
only if $A_1 = A_2 = \cdots = A_n$ is stratified. In particular, for stratification
of $\{A_1, \ldots, A_n\}$, each of $A_1, \ldots, A_n$ must have the same type, and the
type of $\{A_1, \ldots, A_n\}$ must be one higher than this.
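
The definitions of the unit class and of the finite class built from unit classes can be mirrored in a few lines of Python. This is a minimal sketch, assuming that classes are modelled by `frozenset` values; the function names `unit`, `pair`, and `enum` are illustrative and do not come from the text.

```python
def unit(a):
    """The unit class {a}: the class whose sole member is a."""
    return frozenset([a])

def pair(a, b):
    """{a, b}, defined as {a} united with {b}."""
    return unit(a) | unit(b)

def enum(*items):
    """{a1, ..., an}, defined as the union of the unit classes {ai}."""
    out = frozenset()
    for a in items:
        out |= unit(a)
    return out

# When the two members coincide, {a, a} and {a} are the same class.
assert pair(1, 1) == unit(1)
# Repeated members make no difference to the resulting class.
assert enum(1, 2, 2, 3) == frozenset({1, 2, 3})
# {{1}} differs from {1}, echoing the remark that the type goes up by one.
assert unit(unit(1)) != unit(1)
```

Python does not track stratification or types, so the last assertion only hints at the type distinction; it is not a model of it.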

Theorem IX.6.1.
I. $\vdash (x)\ \exists\{x\}$.

II. $\vdash (x_1, \ldots, x_n)\ \exists\{x_1, \ldots, x_n\}$.

Theorem IX.6.2.

**I. $\vdash (x,y):y \in \{x\}. \equiv .y = x$.
II. $\vdash (x,y,z):z \in \{x,y\}. \equiv .z = x \lor z = y$.

III. $\vdash (x_1, \ldots, x_n, y):y \in \{x_1, \ldots, x_n\}. \equiv .y = x_1 \lor \cdots \lor y = x_n$.
*Corollary 1. $\vdash (x).x \in \{x\}$.
Corollary 2. $\vdash (x,y).x,y \in \{x,y\}$.

Corollary 3. $\vdash (x_1, \ldots, x_n).x_1, \ldots, x_n \in \{x_1, \ldots, x_n\}$.
*Theorem IX.6.3. $\vdash (x,y):\{x\} = \{y\}. \supset .x = y$.
Proof. Assume $\{x\} = \{y\}$. Then, since $x \in \{x\}$ by Thm.IX.6.2, Cor. 1,
we get $x \in \{y\}$, whence we get $x = y$ by Thm.IX.6.2, Part I.
Corollary 1. $\vdash (x,y):\{x\} = \{y\}. \equiv .x = y$.
Corollary 2. $\vdash \exists\{\{x\} \mid P\}.: \supset :.(x):\{x\} \in \{\{x\} \mid P\}. \equiv .P$.
Historically, it has been popular to define $x = y$ as

(1) $(\alpha):x \in \alpha. \supset .y \in \alpha$.

The justification for this is based on the following argument. The basic
characteristic of equality is that expressed in Axiom scheme 7A, namely,
that if $x = y$, then any statement which is true of $x$ shall be true of $y$. As
each statement determines a class $\alpha$ (?), Axiom scheme 7A is equivalent to
saying that $y$ is in every class $\alpha$ which contains $x$, which is just the condition
(1) given above. In the present system, not every statement determines a
class, which somewhat spoils the above argument and makes (1) less
intuitive as a definition of equality. However, it could be used, as is shown
by the next theorem.

In the Zermelo set theory, (1) could not be used at all because of the
existence of nonsets (or nonindividuals) which cannot be members of anything.
Thus, if $x$ is a nonset, $x \in \alpha$ is false for all $\alpha$ and so (1) would be true for all $y$.
Thus in Zermelo set theory, one is forced to use our definition of equality
or else to take equality as undefined. Our definition of equality is suitable
only in case all objects are classes.

If we wish to enlarge the present system by the addition of nonclasses,
we would wish to change to (1) as a definition of equality. If we wish to
have both nonclasses and nonmembers, then we probably have to take
equality as undefined, with appropriate axiom schemes, such as Axiom
schemes 7A and 7B, for instance.

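Theorem IX.6.4. $\vdash (x,y):.x = y: \equiv :(\alpha).x \in \alpha \supset y \in \alpha$.

Proof. By Axiom scheme 7, we have

(1) $\vdash x = y: \supset :(\alpha).x \in \alpha \supset y \in \alpha$.

Assume $(\alpha).x \in \alpha \supset y \in \alpha$. Then by Axiom scheme 8, $x \in \{x\} \supset y \in \{x\}$.
Then by Thm.IX.6.2, Cor. 1, $y \in \{x\}$, and so by Thm.IX.6.2, Part I, $y = x$.

Corollary. $\vdash (x,y):.x = y: \equiv :(\alpha).x \in \alpha \equiv y \in \alpha$.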
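
The equivalence in Theorem IX.6.4, and the earlier remark about nonsets, can be checked by brute force in a small finite model. The following is a sketch under assumptions: the universe is a three-element Python set, "all classes" means all of its subsets, and the names `subclasses` and `leibniz_equal` are hypothetical helpers introduced only for this illustration.

```python
from itertools import chain, combinations

def subclasses(universe):
    """All subsets of a finite universe, as frozensets."""
    items = list(universe)
    return [frozenset(c) for c in chain.from_iterable(
        combinations(items, r) for r in range(len(items) + 1))]

def leibniz_equal(x, y, classes):
    """Condition (1): every class that contains x also contains y."""
    return all(y in a for a in classes if x in a)

universe = {0, 1, 2}
classes = subclasses(universe)

# Inside the universe, condition (1) agrees with ordinary equality;
# the witnessing class in the proof is the unit class {x}.
for x in universe:
    for y in universe:
        assert leibniz_equal(x, y, classes) == (x == y)

# An object belonging to no class satisfies (1) vacuously for every y,
# which is the difficulty with nonsets described for Zermelo set theory.
assert leibniz_equal("nonset", 0, classes)
assert leibniz_equal("nonset", 1, classes)
```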
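*Theorem IX.6.5. $\vdash (\alpha,x):x \in \alpha. \equiv .\{x\} \subseteq \alpha$.

Proof. By Thm.VII.1.5, Part II, $\vdash x \in \alpha: \equiv :(y).y = x \supset y \in \alpha$. Then
by Thm.IX.6.2, Part I, $\vdash x \in \alpha: \equiv :(y).y \in \{x\} \supset y \in \alpha$. This is our theorem.

Corollary 1. $\vdash (\alpha,x):x \in \alpha. \equiv .\{x\} \cap \alpha = \{x\}$.

Corollary 2. $\vdash (\alpha,x):x \in \alpha. \equiv .\{x\} \cup \alpha = \alpha$.

Corollary 3. $\vdash (\alpha,x):x \in \alpha. \equiv .(\alpha - \{x\}) \cup \{x\} = \alpha$.

Proof. Use Thm.IX.4.13 and Thm.IX.4.4, Part XIX.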

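Theorem IX.6.6. $\vdash (\alpha,x):\sim x \in \alpha. \equiv .\alpha \subseteq \overline{\{x\}}$.

Proof. $\vdash \sim x \in \alpha: \equiv :x \in \overline{\alpha}: \equiv :\{x\} \subseteq \overline{\alpha}: \equiv :\alpha \subseteq \overline{\{x\}}$, by Thm.IX.6.5 and
Thm.IX.4.17.
*Corollary 1. $\vdash (\alpha,x):\sim x \in \alpha. \equiv .\{x\} \cap \alpha = \Lambda$.

Corollary 2. $\vdash (\alpha,x):x \in \alpha. \equiv .\{x\} \cap \alpha \neq \Lambda$.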

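Theorem IX.6.7. $\vdash (\alpha,x):\{x\} \cap \alpha = \Lambda.\lor.\{x\} \cap \alpha = \{x\}$.

Proof. By Thm.IX.6.6, Cor. 2 and Thm.IX.6.5, Cor. 1, $\vdash \{x\} \cap \alpha \neq \Lambda. \equiv .\{x\} \cap \alpha = \{x\}$.
So $\vdash \{x\} \cap \alpha \neq \Lambda. \supset .\{x\} \cap \alpha = \{x\}$. This is our
theorem.

Corollary. $\vdash (\alpha,x):\alpha \subseteq \{x\}. \supset .\alpha = \Lambda \lor \alpha = \{x\}$.

For if $\alpha \subseteq \{x\}$, then $\alpha = \{x\} \cap \alpha$.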

Theorem IX.6.8. $\vdash (x,y):x \neq y. \equiv .\{x\} \cap \{y\} = \Lambda$.

Proof. By Thm.IX.6.2, Part I, $\vdash x \neq y. \equiv .\sim y \in \{x\}$, and by
Thm.IX.6.6, Cor. 1, $\vdash \sim y \in \{x\}. \equiv .\{y\} \cap \{x\} = \Lambda$.

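Theorem IX.6.9.

I. $\vdash (\alpha).\bigcap\{\alpha\} = \alpha$.

II. $\vdash (\alpha,\beta).\bigcap\{\alpha,\beta\} = \alpha \cap \beta$.
III. $\vdash (\alpha_1, \ldots, \alpha_n).\bigcap\{\alpha_1, \ldots, \alpha_n\} = \alpha_1 \cap \cdots \cap \alpha_n$.

IV. $\vdash (\lambda,\alpha).\bigcap(\{\alpha\} \cup \lambda) = \alpha \cap \bigcap\lambda$.
V. $\vdash (\alpha).\bigcup\{\alpha\} = \alpha$.
VI. $\vdash (\alpha,\beta).\bigcup\{\alpha,\beta\} = \alpha \cup \beta$.
VII. $\vdash (\alpha_1, \ldots, \alpha_n).\bigcup\{\alpha_1, \ldots, \alpha_n\} = \alpha_1 \cup \cdots \cup \alpha_n$.
VIII. $\vdash (\lambda,\alpha).\bigcup(\{\alpha\} \cup \lambda) = \alpha \cup \bigcup\lambda$.

Proof of Part I.

$\vdash x \in \bigcap\{\alpha\}:$
$\equiv :(\beta).\beta \in \{\alpha\} \supset x \in \beta:$
$\equiv :(\beta).\beta = \alpha \supset x \in \beta:$
$\equiv :x \in \alpha$.

Proof of Parts II, III, and IV. Use Part I with Thm.IX.5.3, Part I.
Proof of Part V.

$\vdash x \in \bigcup\{\alpha\}:$
$\equiv :(E\beta).\beta \in \{\alpha\}.x \in \beta:$
$\equiv :(E\beta).\beta = \alpha.x \in \beta:$
$\equiv :x \in \alpha$.

Proof of Parts VI, VII, and VIII. Use Part V with Thm.IX.5.3, Part II.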
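
Several parts of Theorem IX.6.9 can be spot-checked on finite sets. The sketch below assumes a finite universe `V` standing in for the universal class, so that the intersection of the empty class can be taken to be `V`; the helper names `big_cup` and `big_cap` are illustrative, not the book's notation.

```python
from functools import reduce

def big_cup(lam):
    """Union of all members of the class of classes `lam`."""
    return frozenset().union(*lam)

def big_cap(lam, universe):
    """Intersection of all members of `lam`, taken inside `universe`.

    Starting the fold at `universe` makes the intersection of the
    empty class equal to the whole universe.
    """
    return reduce(frozenset.intersection, lam, frozenset(universe))

V = frozenset(range(6))
a = frozenset({1, 2, 3})
b = frozenset({2, 3, 4})
lam = frozenset({frozenset({2, 3, 5}), frozenset({0, 2, 3})})

assert big_cap(frozenset({a}), V) == a                           # Part I
assert big_cap(frozenset({a, b}), V) == a & b                    # Part II
assert big_cap(frozenset({a}) | lam, V) == a & big_cap(lam, V)   # Part IV
assert big_cup(frozenset({a})) == a                              # Part V
assert big_cup(frozenset({a, b})) == a | b                       # Part VI
assert big_cup(frozenset({a}) | lam) == a | big_cup(lam)         # Part VIII
```

A handful of sample classes proves nothing in general; the check is only meant to make the shape of each identity concrete.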

We now introduce the class of unit subclasses of $A$, which we denote by
$\mathrm{USC}(A)$. In this, the letters U, S, and C stand for "unit," "sub," and
"classes," respectively. Specifically we define

$\mathrm{USC}(A)$        for        $\{\{x\} \mid x \in A\}$

where $x$ is a variable that does not occur in $A$. We also define

$\mathrm{USC}^2(A)$        for        $\mathrm{USC}(\mathrm{USC}(A))$

$\mathrm{USC}^3(A)$        for        $\mathrm{USC}(\mathrm{USC}^2(A))$

etc.
We also define the cardinal numbers 0 and 1 as follows:

$0$        for        $\{\Lambda\}$

$1$        for        $\mathrm{USC}(V)$.

Except for $x$ (which is bound) and other bound variables implicit in the
notation $\{\{x\} \mid x \in A\}$, variables are bound or free in $\mathrm{USC}(A)$ according as
they are bound or free in $A$. Correspondingly for $\mathrm{USC}^2(A)$, $\mathrm{USC}^3(A)$, etc.
There are no free variables in 0 or 1.

$\mathrm{USC}(A)$ is stratified if and only if $A$ is stratified, and similarly for
$\mathrm{USC}^2(A)$, $\mathrm{USC}^3(A)$, etc. For stratification, $\mathrm{USC}(A)$ must be one type
higher than $A$, $\mathrm{USC}^2(A)$ must be two types higher, etc. 0 and 1 are
stratified and may be assigned any types whatever, even to being assigned
different types in the same context.
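
The definition of USC(A) and its iterates translates directly into a set comprehension. This is a hedged sketch only: classes are modelled by `frozenset`, the universal class V is replaced by a small finite set (an assumption needed to compute anything), and `unit`, `usc`, and `usc_n` are hypothetical names.

```python
def unit(a):
    """The unit class {a}."""
    return frozenset([a])

def usc(a):
    """USC(a): the class of unit classes {x} with x in a."""
    return frozenset(unit(x) for x in a)

def usc_n(a, n):
    """USC iterated n times, so usc_n(a, 2) plays the role of USC^2(a)."""
    for _ in range(n):
        a = usc(a)
    return a

a = frozenset({1, 2})
assert usc(a) == frozenset({unit(1), unit(2)})
assert usc_n(a, 2) == frozenset({unit(unit(1)), unit(unit(2))})

# The cardinal 0 is {Lambda}; with a finite stand-in V for the universal
# class, the cardinal 1 is modelled by usc(V), the class of all unit classes.
zero = unit(frozenset())
V = frozenset(range(4))
one = usc(V)
assert all(unit(x) in one for x in V)
assert one != zero
```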

Theorem IX.6.10.

I. $\vdash (\alpha)\ \exists(\mathrm{USC}(\alpha))$.
*II. $\vdash (\alpha,x):\{x\} \in \mathrm{USC}(\alpha). \equiv .x \in \alpha$.
*III. $\vdash (\alpha,x):.x \in \mathrm{USC}(\alpha): \equiv :(Ey).y \in \alpha.x = \{y\}$.
Proof of Part I. Use Thm.IX.3.1.
Proof of Part II. Use Thm.IX.6.3, Cor. 2.
Proof of Part III. Use Thm.IX.3.2, Cor. 3.
Corollary 1. $\vdash (\alpha)\ \exists(\mathrm{USC}^2(\alpha))$.
Proof. Use Part I twice.

Corollary 2. $\vdash (\alpha,x):\{\{x\}\} \in \mathrm{USC}^2(\alpha). \equiv .x \in \alpha$.
Proof. Use Part II twice.

Corollary 3. $\vdash (\alpha,x):.x \in \mathrm{USC}^2(\alpha): \equiv :(Ey).y \in \alpha.x = \{\{y\}\}$.
Proof. Use Part III twice.
Corollary 4. $\vdash (x).\{x\} \in 1$.
Proof. Put $\alpha = V$ in Part II.
Corollary 5. $\vdash (x):x \in 1. \equiv .(Ey).x = \{y\}$.
Proof. Put $\alpha = V$ in Part III.
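Theorem IX.6.11. $\vdash (\alpha).\bigcup(\mathrm{USC}(\alpha)) = \alpha$.
Proof.

$\vdash x \in \bigcup(\mathrm{USC}(\alpha)):.$
$\equiv :.(E\beta):\beta \in \mathrm{USC}(\alpha).x \in \beta:.$
$\equiv :.(E\beta):x \in \beta:(Ey).y \in \alpha.\beta = \{y\}:.$
$\equiv :.(E\beta,y):x \in \beta.y \in \alpha.\beta = \{y\}:.$
$\equiv :.(Ey):y \in \alpha:(E\beta).\beta = \{y\}.x \in \beta:.$
$\equiv :.(Ey):y \in \alpha.x \in \{y\}:.$
$\equiv :.(Ey):y = x.y \in \alpha:.$
$\equiv :.x \in \alpha$.

Corollary 1. $\vdash (\alpha,\beta):\mathrm{USC}(\alpha) = \mathrm{USC}(\beta). \equiv .\alpha = \beta$.

Proof. If $\mathrm{USC}(\alpha) = \mathrm{USC}(\beta)$, then $\alpha = \bigcup(\mathrm{USC}(\alpha)) = \bigcup(\mathrm{USC}(\beta)) = \beta$.

Corollary 2. $\vdash \exists\{\mathrm{USC}(\alpha) \mid P\}.: \supset :.(\alpha):\mathrm{USC}(\alpha) \in \{\mathrm{USC}(\alpha) \mid P\}. \equiv .P$.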

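Theorem IX.6.12.

I. $\vdash (\alpha,\beta):\mathrm{USC}(\alpha \cap \beta) = \mathrm{USC}(\alpha) \cap \mathrm{USC}(\beta)$.
*II. $\vdash (\alpha,\beta):\mathrm{USC}(\alpha \cup \beta) = \mathrm{USC}(\alpha) \cup \mathrm{USC}(\beta)$.
III. $\vdash (\alpha,\beta):\mathrm{USC}(\alpha - \beta) = \mathrm{USC}(\alpha) - \mathrm{USC}(\beta)$.
*IV. $\vdash \mathrm{USC}(\Lambda) = \Lambda$.

V. $\vdash (\alpha,\beta):\alpha \subseteq \beta. \equiv .\mathrm{USC}(\alpha) \subseteq \mathrm{USC}(\beta)$.
VI. $\vdash (\alpha,\beta):\alpha \subset \beta. \equiv .\mathrm{USC}(\alpha) \subset \mathrm{USC}(\beta)$.

Proof of Part I. Assume $x \in \mathrm{USC}(\alpha \cap \beta)$. Then $(Ey).y \in \alpha \cap \beta.x = \{y\}$.
So by rule C, $y \in \alpha$, $y \in \beta$, $x = \{y\}$. So $x \in \mathrm{USC}(\alpha)$ and $x \in \mathrm{USC}(\beta)$. Hence

(1) $\vdash x \in \mathrm{USC}(\alpha \cap \beta). \supset .x \in \mathrm{USC}(\alpha) \cap \mathrm{USC}(\beta)$.

Now assume $x \in \mathrm{USC}(\alpha) \cap \mathrm{USC}(\beta)$. Then $x \in \mathrm{USC}(\alpha)$ and $x \in \mathrm{USC}(\beta)$.
So by rule C, $y \in \alpha$, $x = \{y\}$, $z \in \beta$, $x = \{z\}$. This gives $\{y\} = \{z\}$, $y = z$,
$y \in \beta$. So $y \in \alpha \cap \beta.x = \{y\}$. So $x \in \mathrm{USC}(\alpha \cap \beta)$.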
Proof of Part II.

$\vdash x \in \mathrm{USC}(\alpha \cup \beta):.$
$\equiv :.(Ey).y \in \alpha \cup \beta.x = \{y\}:.$
$\equiv :.(Ey):y \in \alpha.x = \{y\}.\lor.y \in \beta.x = \{y\}:.$
$\equiv :.(Ey).y \in \alpha.x = \{y\}:\lor:(Ey).y \in \beta.x = \{y\}:.$
$\equiv :.x \in \mathrm{USC}(\alpha).\lor.x \in \mathrm{USC}(\beta)$.

Proof of Part IV. It suffices to prove $\vdash (x) \sim x \in \mathrm{USC}(\Lambda)$. That is,
$\vdash (x,y).x = \{y\} \supset \sim y \in \Lambda$. This is a ready consequence of Thm.IX.4.10,
Part II.

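Proof of Part III. Assume $x \in \mathrm{USC}(\alpha - \beta)$. Then by Part I, $x \in \mathrm{USC}(\alpha)$
and $x \in \mathrm{USC}(\overline{\beta})$. Then we need to prove $\sim x \in \mathrm{USC}(\beta)$, which we do by
reductio ad absurdum. Assume further $x \in \mathrm{USC}(\beta)$. Then by Part I,
$x \in \mathrm{USC}(\beta \cap \overline{\beta})$. By Part IV, this is a contradiction. So

(1) $\vdash \mathrm{USC}(\alpha - \beta) \subseteq \mathrm{USC}(\alpha) - \mathrm{USC}(\beta)$.

Now assume $x \in \mathrm{USC}(\alpha)$, $\sim x \in \mathrm{USC}(\beta)$. Then $(Ey).y \in \alpha.x = \{y\}$ and
$(y).x = \{y\} \supset \sim y \in \beta$. Then by rule C, $y \in \alpha$, $x = \{y\}$, so that $\sim y \in \beta$,
$y \in \overline{\beta}$. Then $y \in \alpha - \beta.x = \{y\}$. So $x \in \mathrm{USC}(\alpha - \beta)$.

Proof of Part V. $\vdash \alpha \subseteq \beta: \equiv :\alpha \cap \beta = \alpha: \equiv :\mathrm{USC}(\alpha \cap \beta) = \mathrm{USC}(\alpha): \equiv :$
$\mathrm{USC}(\alpha) \cap \mathrm{USC}(\beta) = \mathrm{USC}(\alpha): \equiv :\mathrm{USC}(\alpha) \subseteq \mathrm{USC}(\beta)$.

Part VI follows readily from Part V.

Corollary 1. $\vdash (\beta).\mathrm{USC}(\overline{\beta}) = 1 - \mathrm{USC}(\beta)$.

Proof. Put $\alpha = V$ in Part III.

Corollary 2. $\vdash (\alpha).\mathrm{USC}(\alpha) \subseteq 1$.

Proof. Put $\beta = V$ in Part V.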
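
Theorem IX.6.12 says that USC commutes with the Boolean operations and preserves inclusion. A brute-force check over all subsets of a small universe makes this concrete. The sketch assumes finite `frozenset` models and a four-element stand-in `V` for the universal class; `unit`, `usc`, and `subclasses` are hypothetical helper names.

```python
from itertools import chain, combinations

def unit(a):
    return frozenset([a])

def usc(a):
    """USC(a): the class of unit classes of members of a."""
    return frozenset(unit(x) for x in a)

def subclasses(v):
    """All subsets of the finite set v."""
    items = sorted(v)
    return [frozenset(c) for c in chain.from_iterable(
        combinations(items, r) for r in range(len(items) + 1))]

V = frozenset(range(4))   # finite stand-in for the universal class
one = usc(V)              # models the cardinal 1 = USC(V)

for a in subclasses(V):
    for b in subclasses(V):
        assert usc(a & b) == usc(a) & usc(b)       # Part I
        assert usc(a | b) == usc(a) | usc(b)       # Part II
        assert usc(a - b) == usc(a) - usc(b)       # Part III
        assert (a <= b) == (usc(a) <= usc(b))      # Part V
        assert (a < b) == (usc(a) < usc(b))        # Part VI
    assert usc(V - a) == one - usc(a)              # Corollary 1
    assert usc(a) <= one                           # Corollary 2

assert usc(frozenset()) == frozenset()             # Part IV
```

All the identities hold here because `usc` is a one-to-one map from members to unit classes, which is also the fact the proofs lean on through Thm.IX.6.3.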

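Theorem IX.6.13.
I. $\vdash (\alpha)\ \exists\hat{x}(\{x\} \in \alpha)$.
II. $\vdash (\alpha,x):x \in \hat{x}(\{x\} \in \alpha). \equiv .\{x\} \in \alpha$.

Theorem IX.6.14. $\vdash (\alpha,\beta):.\alpha \subseteq \mathrm{USC}(\beta): \supset :\hat{x}(\{x\} \in \alpha) \subseteq \beta.\alpha = \mathrm{USC}(\hat{x}(\{x\} \in \alpha))$.

Proof. Assume

(1) $\alpha \subseteq \mathrm{USC}(\beta)$.

Let $x \in \hat{x}(\{x\} \in \alpha)$. Then $\{x\} \in \alpha$. So $\{x\} \in \mathrm{USC}(\beta)$. So $x \in \beta$. Hence

(2) $\hat{x}(\{x\} \in \alpha) \subseteq \beta$.

Now let $x \in \alpha$. Then $x \in \mathrm{USC}(\beta)$. So by rule C, $y \in \beta$, $x = \{y\}$. So
successively, $\{y\} \in \alpha$, $y \in \hat{x}(\{x\} \in \alpha)$, $\{y\} \in \mathrm{USC}(\hat{x}(\{x\} \in \alpha))$,
$x \in \mathrm{USC}(\hat{x}(\{x\} \in \alpha))$. Thus

(3) $\alpha \subseteq \mathrm{USC}(\hat{x}(\{x\} \in \alpha))$.

Finally, let $x \in \mathrm{USC}(\hat{x}(\{x\} \in \alpha))$. Then by rule C, $y \in \hat{x}(\{x\} \in \alpha)$, $x = \{y\}$.
From $y \in \hat{x}(\{x\} \in \alpha)$ we get $\{y\} \in \alpha$ and so $x \in \alpha$. Thus

(4) $\mathrm{USC}(\hat{x}(\{x\} \in \alpha)) \subseteq \alpha$.

Corollary 1. $\vdash (\alpha):\alpha \subseteq 1. \supset .\alpha = \mathrm{USC}(\hat{x}(\{x\} \in \alpha))$.
Proof. Put $\beta = V$.
*Corollary 2. $\vdash (\alpha,\beta):.\alpha \subseteq \mathrm{USC}(\beta): \equiv :(E\gamma).\gamma \subseteq \beta.\alpha = \mathrm{USC}(\gamma)$.
Proof. Combine with Thm.IX.6.12, Part V.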
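Theorem IX.6.15. $\vdash (\alpha).\sim \Lambda \in \mathrm{USC}(\alpha)$.

Proof. Assume $\Lambda \in \mathrm{USC}(\alpha)$. Then by rule C, $y \in \alpha$, $\Lambda = \{y\}$. Hence
$y \in \Lambda$. This gives a contradiction.
Corollary 1. $\vdash (\alpha).\mathrm{USC}(\alpha) \neq 0$.
Corollary 2. $\vdash 1 \neq 0$.
*Theorem IX.6.16. $\vdash (x).\mathrm{USC}(\{x\}) = \{\{x\}\}$.
Proof.

$\vdash y \in \mathrm{USC}(\{x\}):$
$\equiv :(Ez).z \in \{x\}.y = \{z\}:$
$\equiv :(Ez).z = x.y = \{z\}:$
$\equiv :y = \{x\}:$
$\equiv :y \in \{\{x\}\}$.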

We have occasional use for the class of all subclasses of $A$. We call this
$\mathrm{SC}(A)$, the S and C standing for "sub" and "classes," respectively.
Specifically, we put

$\mathrm{SC}(A)$        for        $\hat{\beta}(\beta \subseteq A)$,

$\mathrm{SC}^2(A)$        for        $\mathrm{SC}(\mathrm{SC}(A))$,

$\mathrm{SC}^3(A)$        for        $\mathrm{SC}(\mathrm{SC}^2(A))$,

etc.,

where in the definition of $\mathrm{SC}(A)$, $\beta$ is a variable which does not occur in $A$.
Some people refer to $\mathrm{SC}(A)$ as the power class of $A$.

Except for the bound variables implicit in $\hat{\beta}$ and $\subseteq$, a variable is free or
bound in $\mathrm{SC}(A)$ according as it is free or bound in $A$.

$\mathrm{SC}(A)$ is stratified if and only if $A$ is stratified, and similarly for $\mathrm{SC}^2(A)$,
$\mathrm{SC}^3(A)$, etc. If $A$ is stratified, then the type of $A$ is one less than the type
of $\mathrm{SC}(A)$, two less than the type of $\mathrm{SC}^2(A)$, etc.

One will commonly read $A \in \mathrm{SC}(B)$ as "$A$ is a subclass of $B$."
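
For finite classes, SC(A) is the familiar power set, and it can be generated with the standard library. The following is a minimal sketch, assuming `frozenset` models of classes; the function name `sc` is illustrative.

```python
from itertools import chain, combinations

def sc(a):
    """SC(a): the class of all subclasses of the finite class a."""
    items = list(a)
    return frozenset(frozenset(c) for c in chain.from_iterable(
        combinations(items, r) for r in range(len(items) + 1)))

a = frozenset({1, 2, 3})
power = sc(a)

# b is a member of SC(a) exactly when b is a subclass of a.
assert all(b <= a for b in power)
assert frozenset({1, 3}) in power
assert frozenset({1, 4}) not in power

# A class with n members has 2**n subclasses, so SC^2 grows quickly.
assert len(power) == 8
assert len(sc(sc(frozenset({1, 2})))) == 16
```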

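Theorem IX.6.17.

I. $\vdash (\alpha).\exists(\mathrm{SC}(\alpha))$.
*II. $\vdash (\alpha,\beta):\beta \in \mathrm{SC}(\alpha). \equiv .\beta \subseteq \alpha$.

Theorem IX.6.18.

I. $\vdash (\alpha).\alpha \in \mathrm{SC}(\alpha)$.
II. $\vdash (\alpha).\Lambda \in \mathrm{SC}(\alpha)$.
III. $\vdash (\alpha,\beta).\alpha \cap \beta \in \mathrm{SC}(\alpha)$.

IV. $\vdash (\alpha,x):x \in \alpha. \equiv .\{x\} \in \mathrm{SC}(\alpha)$.
Corollary 1. $\vdash (\alpha).\mathrm{USC}(\alpha) \subseteq \mathrm{SC}(\alpha)$.
Proof. Use Part IV.

Corollary 2. $\vdash (\alpha).\mathrm{USC}(\alpha) \subset \mathrm{SC}(\alpha)$.

Proof. Use Part II, Thm.IX.4.21, and Thm.IX.6.15.

Theorem IX.6.19.
I. $\vdash \mathrm{SC}(\Lambda) = 0$.
II. $\vdash \mathrm{SC}(V) = V$.

Theorem IX.6.20. $\vdash (\alpha).\bigcup(\mathrm{SC}(\alpha)) = \alpha$.

Proof. Assume $x \in \bigcup(\mathrm{SC}(\alpha))$. Then by rule C, $\beta \in \mathrm{SC}(\alpha)$ and $x \in \beta$. So
$\beta \subseteq \alpha$ and $x \in \beta$. So $x \in \alpha$. Conversely, let $x \in \alpha$. Then $\alpha \in \mathrm{SC}(\alpha).x \in \alpha$.
So $(E\beta).\beta \in \mathrm{SC}(\alpha).x \in \beta$. Hence $x \in \bigcup(\mathrm{SC}(\alpha))$.

Corollary 1. $\vdash (\alpha,\beta):\mathrm{SC}(\alpha) = \mathrm{SC}(\beta). \equiv .\alpha = \beta$.

Corollary 2. $\vdash \exists\{\mathrm{SC}(\alpha) \mid P\}.: \supset :.(\alpha):\mathrm{SC}(\alpha) \in \{\mathrm{SC}(\alpha) \mid P\}. \equiv .P$.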
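
The relations between USC, SC, and the union operation in Theorems IX.6.18 through IX.6.20 can be checked exhaustively on a small universe. As before this is only a sketch under assumptions: finite `frozenset` models, and hypothetical helper names `sc`, `usc`, and `big_cup`.

```python
from itertools import chain, combinations

def sc(a):
    """SC(a): all subclasses of the finite class a."""
    items = list(a)
    return frozenset(frozenset(c) for c in chain.from_iterable(
        combinations(items, r) for r in range(len(items) + 1)))

def usc(a):
    """USC(a): all unit subclasses of a."""
    return frozenset(frozenset([x]) for x in a)

def big_cup(lam):
    """Union of all members of lam."""
    return frozenset().union(*lam)

universe = frozenset(range(4))
all_classes = list(sc(universe))

for a in all_classes:
    assert big_cup(sc(a)) == a     # Thm.IX.6.20
    assert big_cup(usc(a)) == a    # Thm.IX.6.11
    assert usc(a) < sc(a)          # proper inclusion: Lambda is in SC(a) only

# SC(Lambda) is {Lambda}, which is the cardinal 0 of the text.
assert sc(frozenset()) == frozenset({frozenset()})

# Distinct classes have distinct SC images (Cor. 1 of Thm.IX.6.20).
assert len({sc(a) for a in all_classes}) == len(all_classes)
```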

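EXERCISES

IX.6.1. State what are the members of $\mathrm{SC}(\{x\})$ and $\mathrm{SC}(\{x,y\})$.
IX.6.2. Prove $\vdash (x,y,u,v):\{\{x\}, \{x,y\}\} = \{\{u\}, \{u,v\}\}. \equiv .x = u.y = v$.
(Hint. $\vdash \bigcap(\{\{x\}, \{x,y\}\}) = \{x\}$ and $\vdash \bigcup(\{\{x\}, \{x,y\}\}) = \{x,y\}$.)
IX.6.3. Prove $\vdash (\lambda).\lambda \subseteq \mathrm{SC}(\bigcup\lambda)$.
IX.6.4. Prove $\vdash (\alpha).\bigcap(\mathrm{SC}(\alpha)) = \Lambda$.
IX.6.5. Prove:

(a) $\vdash (\alpha,\beta):\alpha \subseteq \beta. \equiv .\mathrm{SC}(\alpha) \subseteq \mathrm{SC}(\beta)$.

(b) $\vdash (\alpha,\beta):\alpha \subset \beta. \equiv .\mathrm{SC}(\alpha) \subset \mathrm{SC}(\beta)$.

IX.6.6. Prove:

(a) $\vdash (\lambda,\mu).\mathrm{SC}(\lambda \cap \mu) = \mathrm{SC}(\lambda) \cap \mathrm{SC}(\mu)$.

(b) $\vdash (\lambda,\mu).\mathrm{SC}(\lambda \cup \mu) = \{\alpha \cup \beta \mid \alpha \in \mathrm{SC}(\lambda).\beta \in \mathrm{SC}(\mu)\}$.
IX.6.7. Prove:

(a) $\vdash (\lambda):\lambda \neq \Lambda. \supset .\mathrm{USC}(\bigcap\lambda) = \bigcap\{\mathrm{USC}(\alpha) \mid \alpha \in \lambda\}$.

(b) $\vdash (\lambda).\mathrm{USC}(\bigcup\lambda) = \bigcup\{\mathrm{USC}(\alpha) \mid \alpha \in \lambda\}$.

IX.6.8. Indicate why one needs the hypothesis $\lambda \neq \Lambda$ in Ex.IX.6.7(a)
but not in (b).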

7. Variables over the Range $\Sigma$. We have discussed restricted quantification
in Chapter VI, Sec. 8, and elsewhere. We now discuss a more inclusive
concept, namely, variables of restricted range.

Suppose we are particularly interested in members of some class $\Sigma$, and
we choose certain variables, $x$, $y$, $z$, which shall denote only members of $\Sigma$.
Common instances would be where $\Sigma$ is the class of integers, or the class
of real numbers, or the class of complex numbers, or the class of vectors, etc.

This situation is by all odds the most commonly occurring one in
mathematics. Almost never, except in cases of the most extreme generality, is
there need for completely unrestricted variables, such as we have been using
so far in this chapter. Accordingly, to put our class calculus in a form
suitable for use in most mathematical disciplines, we must develop
carefully the technique of variables of restricted range.

A part of the technique is the use of restricted quantification. If $x$ is
restricted to be a member of $\Sigma$, then our conventions on restricted
quantification provide that $(x)\ F(x)$ and $(Ex)\ F(x)$ shall denote $(u).u \in \Sigma \supset F(u)$
and $(Eu).u \in \Sigma.F(u)$, respectively.

Still further conventions are needed in order to handle the class calculus
effectively, particularly a convention as to the meaning of $\hat{x}F(x)$. To
investigate these conventions, we assume throughout the rest of this section
that $x$, $y$, and $z$ are restricted to the range $\Sigma$, so that $(x)\ F(x)$ and $(Ex)\ F(x)$
are to be interpreted as above.

For the present discussion, $\Sigma$ may be a variable or a term of the form $\iota w\, P$.
Clearly, since the range of $x$, $y$, and $z$ is to depend upon $\Sigma$, it would not do
for the variables $x$, $y$, or $z$ to have free occurrences in the term $\Sigma$. Otherwise
$\Sigma$ is completely at our disposal.

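Theorem IX.7.1. $\vdash (x).x \in \alpha \equiv x \in \alpha \cap \Sigma$.

Proof. By Thm.VI.6.1, Part LXVII,

$\vdash u \in \Sigma. \supset .u \in \alpha \equiv u \in \alpha \cap \Sigma$.

Hence we see that $x \in \alpha$ might as well be replaced by $x \in \alpha \cap \Sigma$ everywhere.
We may think of $\alpha \cap \Sigma$ as the residue of $\alpha$ (modulo $\Sigma$). We shall say
that $\alpha$ is congruent to $\beta$ (modulo $\Sigma$) if $\alpha \cap \Sigma = \beta \cap \Sigma$.
Theorem IX.7.2. $\vdash (x).x \in \alpha \equiv x \in \beta: \equiv :\alpha \cap \Sigma = \beta \cap \Sigma$.
Proof. By Thm.VI.6.1, Part LXVII,

$\vdash u \in \Sigma. \supset .u \in \alpha \equiv u \in \alpha \cap \Sigma$,

$\vdash u \in \Sigma. \supset .u \in \beta \equiv u \in \beta \cap \Sigma$.

So

$\vdash u \in \Sigma. \supset .u \in \alpha \equiv u \in \beta: \equiv :u \in \alpha \cap \Sigma \equiv u \in \beta \cap \Sigma$.

Hence

$\vdash (u):u \in \Sigma. \supset .u \in \alpha \equiv u \in \beta.: \equiv :.(u):u \in \alpha \cap \Sigma \equiv u \in \beta \cap \Sigma$.

This is our theorem.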
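
Theorem IX.7.2 states that two classes agree on every member of the range exactly when they are congruent modulo that range. The following brute-force check is a sketch under assumptions: a five-element universe, a fixed finite `sigma` standing in for the range, and hypothetical helper names `congruent`, `agree_on`, and `subclasses`.

```python
from itertools import chain, combinations

def congruent(a, b, sigma):
    """a is congruent to b (modulo sigma): the residues a & sigma and b & sigma coincide."""
    return a & sigma == b & sigma

def agree_on(a, b, sigma):
    """Restricted quantifier: for every x in sigma, x is in a iff x is in b."""
    return all((x in a) == (x in b) for x in sigma)

def subclasses(v):
    items = sorted(v)
    return [frozenset(c) for c in chain.from_iterable(
        combinations(items, r) for r in range(len(items) + 1))]

universe = frozenset(range(5))
sigma = frozenset({0, 2, 4})

# The two conditions coincide for every pair of classes in the universe.
for a in subclasses(universe):
    for b in subclasses(universe):
        assert agree_on(a, b, sigma) == congruent(a, b, sigma)

# Two different classes with the same residue modulo sigma.
assert congruent(frozenset({0, 1}), frozenset({0, 3}), sigma)
assert frozenset({0, 1}) != frozenset({0, 3})
```

The last two lines show why unrestricted equality is too fine a notion once only members of the range are being considered.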

This theorem substantiates our earlier conclusion, based on Thm.IX.7.1,
that, when we are dealing with variables, $x$, $y$, $z$, restricted to $\Sigma$, we might
as well use $\alpha \cap \Sigma$ in place of $\alpha$ whenever one has $x \in \alpha$. That is, one can
deal with classes modulo $\Sigma$ as far as membership of $x$, $y$, $z$ in those classes
is concerned.

We note that $\alpha \cap \Sigma$ is a subclass of $\Sigma$. So if we are replacing $\alpha$ by $\alpha \cap \Sigma$
for certain classes, we are essentially restricting attention to subclasses of $\Sigma$.
Accordingly, if we are dealing with $x$, $y$, and $z$ restricted to $\Sigma$, it would be
natural to restrict $\alpha$, $\beta$, and $\gamma$ to $\mathrm{SC}(\Sigma)$, and we do so for the rest of this
section. This is a further part of the technique of variables restricted to $\Sigma$.
So now $(\alpha)\ F(\alpha)$ and $(E\alpha)\ F(\alpha)$ denote $(\delta).\delta \subseteq \Sigma \supset F(\delta)$ and
$(E\delta).\delta \subseteq \Sigma.F(\delta)$, respectively. With $\alpha$, $\beta$, and $\gamma$ serving as restricted variables,
we can now prove a theorem which looks like our definition of "$=$".

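Theorem IX.7.3. $\vdash (\alpha,\beta):.(x).x \in \alpha \equiv x \in \beta: \equiv :\alpha = \beta$.

Proof. Written with unrestricted class variables, our theorem takes the
form $(\theta,\phi)::\theta \subseteq \Sigma.\phi \subseteq \Sigma.: \supset :.(x).x \in \theta \equiv x \in \phi: \equiv :\theta = \phi$. Assume $\theta \subseteq \Sigma$
and $\phi \subseteq \Sigma$. Then $\theta = \theta \cap \Sigma$ and $\phi = \phi \cap \Sigma$. So by Thm.IX.7.2,
$(x).x \in \theta \equiv x \in \phi: \equiv :\theta = \phi$.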

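In accordance with our conventions in Chapter VIII, Sec. 2, in order to
interpret $\iota\alpha\, F(\alpha)$ we must choose an $A$ such that $A \subseteq \Sigma$. We choose $\Lambda$ for
this $A$, and then $\iota\alpha\, F(\alpha)$ is to denote

$\iota\delta\,(\delta = \iota\phi(\phi \subseteq \Sigma.F(\phi)).(E_1\phi).\phi \subseteq \Sigma.F(\phi):\lor:\delta = \Lambda.\sim(E_1\phi).\phi \subseteq \Sigma.F(\phi))$.

In view of this, and recalling that $\hat{x}F(x)$ is an abbreviation for
$\iota\alpha\,(x).x \in \alpha \equiv F(x)$, we see that $\hat{x}F(x)$ denotes

$\iota\delta\,(\delta = \iota\phi(\phi \subseteq \Sigma:(u):u \in \Sigma. \supset .u \in \phi \equiv F(u)):.(E_1\phi):\phi \subseteq \Sigma:(u):u \in \Sigma. \supset .u \in \phi \equiv F(u).:\lor:.\delta = \Lambda:.\sim(E_1\phi):\phi \subseteq \Sigma:(u):u \in \Sigma. \supset .u \in \phi \equiv F(u))$.

Actually, from the intuitive interpretation that one would wish to give
$\hat{x}F(x)$ when $x$ denotes a member of $\Sigma$, one would wish $\hat{x}F(x)$ to mean
$\hat{u}(u \in \Sigma.F(u))$. We shall prove that, if the class denoted by $\hat{x}F(x)$ exists,
then indeed $\hat{x}F(x) = \hat{u}(u \in \Sigma.F(u))$.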

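Theorem IX.7.4. $\vdash (\Sigma,\phi)::\phi \subseteq \Sigma:(u):u \in \Sigma. \supset .u \in \phi \equiv F(u).: \equiv :.(u):u \in \phi. \equiv .u \in \Sigma.F(u)$.

Proof. Straightforward.

Theorem IX.7.5. If $\alpha$ is a variable which does not occur in $F(x)$, then
$\vdash (E\alpha)(x).x \in \alpha \equiv F(x): \equiv :\exists\hat{u}(u \in \Sigma.F(u))$.

Proof. Note that $(E\alpha)(x).x \in \alpha \equiv F(x)$ is $(E\phi):\phi \subseteq \Sigma:(u):u \in \Sigma. \supset .u \in \phi \equiv F(u)$,
and $\exists\hat{u}(u \in \Sigma.F(u))$ is $(E\phi)(u):u \in \phi. \equiv .u \in \Sigma.F(u)$.

Note that $(E\alpha)(x).x \in \alpha \equiv F(x)$ is just the formula which we would
abbreviate to $\exists\hat{x}F(x)$ if the variables $x$ and $\alpha$ were unrestricted. So essentially
the theorem just proved says $\vdash \exists\hat{x}F(x) \equiv \exists\hat{u}(u \in \Sigma.F(u))$.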

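Theorem IX.7.6. $\vdash \exists\hat{u}(u \in \Sigma.F(u)): \supset :\hat{x}F(x) = \hat{u}(u \in \Sigma.F(u))$.

Proof. Assume $\exists\hat{u}(u \in \Sigma.F(u))$. Then by Thm.IX.3.2 and Thm.IX.7.4,
$(\phi)::\hat{u}(u \in \Sigma.F(u)) = \phi.: \equiv :.\phi \subseteq \Sigma:(u):u \in \Sigma. \supset .u \in \phi \equiv F(u)$. So
$(E\delta)(\phi)::\delta = \phi.: \equiv :.\phi \subseteq \Sigma:(u):u \in \Sigma. \supset .u \in \phi \equiv F(u)$. That is,
$(E_1\phi):.\phi \subseteq \Sigma:(u):u \in \Sigma. \supset .u \in \phi \equiv F(u)$. Then by Thm.VIII.2.10, Part I, and Thm.VII.2.3,
$\hat{x}F(x) = \iota\phi(\phi \subseteq \Sigma:(u):u \in \Sigma. \supset .u \in \phi \equiv F(u))$. However, by Axiom
scheme 9 and Thm.IX.7.4, $\vdash \iota\phi(\phi \subseteq \Sigma:(u):u \in \Sigma. \supset .u \in \phi \equiv F(u)) = \hat{u}(u \in \Sigma.F(u))$.

Corollary. $\vdash (E\alpha)(x).x \in \alpha \equiv F(x): \supset :\hat{x}F(x) = \hat{u}(u \in \Sigma.F(u))$ if $\alpha$ is a
variable which does not occur in $F(x)$.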
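
The upshot of Theorem IX.7.6 is that class abstraction over a restricted variable amounts to filtering the range by the defining condition. A small Python illustration follows; it is a sketch only, assuming finite stand-ins `V` and `sigma` for the universal class and the range, a sample predicate `F`, and the hypothetical helper name `abstract`.

```python
def abstract(pred, domain):
    """The class of members u of `domain` satisfying `pred`."""
    return frozenset(u for u in domain if pred(u))

V = frozenset(range(-5, 20))     # finite stand-in for the universal class
sigma = frozenset(range(10))     # the range of the restricted variable x

def F(u):
    # Sample condition, chosen only for illustration.
    return u % 3 == 0

restricted = abstract(F, sigma)  # the class abstract with x restricted to sigma
unrestricted = abstract(F, V)    # the abstract with an unrestricted variable

# The restricted abstract is the unrestricted one cut down to sigma.
assert restricted == unrestricted & sigma
# It is a subclass of sigma, and inside sigma membership coincides with F.
assert restricted <= sigma
assert all((x in restricted) == F(x) for x in sigma)
```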

We notice that in our interpretation of $\hat{x}F(x)$ we are applying to the
implicit occurrence of $\alpha$ the rule that any variable which is to be associated
with $x$, $y$, or $z$ in a relation of membership is to be restricted to $\mathrm{SC}(\Sigma)$. If $\alpha$
occurs as a member of some $\lambda$, then the same rule would call for restricting
$\lambda$ to $\mathrm{SC}^2(\Sigma)$. Then by Thm.IX.7.6, we would have

$\vdash \exists\hat{\delta}(\delta \subseteq \Sigma.G(\delta)): \supset :\hat{\alpha}G(\alpha) = \hat{\delta}(\delta \subseteq \Sigma.G(\delta))$.

Similarly, if $\lambda$ occurs as a member of some class, that class should be
restricted to $\mathrm{SC}^3(\Sigma)$. Thus we find that there arises a natural sort of
hierarchy of types.

By Thm.IX.7.3, our definition of equality of $\alpha$ and $\beta$ is valid in terms of
variables of restricted range. This suggests that the previous theorems of
this chapter remain true if interpreted as involving variables of restricted
range. This is true with reservations. As we noted above, in combinations
like $x \in \alpha$, the ranges of $x$ and $\alpha$ have to be properly adjusted. If $x$ is
restricted to $\Sigma$, then $\alpha$ should be restricted to $\mathrm{SC}(\Sigma)$. More generally, if
there occurs such a formula as $(x):.x \in \bigcap\lambda: \equiv :(\alpha).\alpha \in \lambda \supset x \in \alpha$, then if we
wish $x$ restricted to $\Sigma$, we must not only have $\alpha$ restricted to $\mathrm{SC}(\Sigma)$ but $\lambda$
restricted to $\mathrm{SC}^2(\Sigma)$.

Thus, in dealing with variables over a restricted range, we have imposed
upon us a sort of natural hierarchy of types. Consequently, in dealing with
variables over restricted ranges, we can hope to carry over only those
theorems which would fit into such a hierarchy of types. Subject to this
condition, however, the theorems not involving restricted variables carry over
into theorems involving restricted variables. Certain other minor changes
are needed. One such change is that, if a certain variable is restricted to $\Sigma$,
then $\Sigma$ serves as the universal class for that variable. This follows from
Thm.IX.7.1, which with Thm.IX.7.3 tells us that

$\vdash (\alpha).\alpha = \alpha \cap \Sigma$,

which is analogous to the property

$\vdash (\delta).\delta = \delta \cap V$.
Another change is that complementation must be replaced by
complementation with respect to $\Sigma$. That is, analogous to $\overline{\delta}$ for unrestricted
variables, we must use $\Sigma - \delta$ for variables of restricted range.

One should note how the type hierarchy is working here. If we have
$x \in \alpha$ and $\alpha \in \lambda$ occurring and $x$ restricted to $\Sigma$, then the universal class for
$x$ is $\Sigma$, the universal class for $\alpha$ is $\mathrm{SC}(\Sigma)$, and the universal class for $\lambda$ is
$\mathrm{SC}^2(\Sigma)$. However in such formulas as $\alpha \cap V$, the $V$ which must be used
is the $V$ of the same type as $\alpha$, which is the $V$ for the members of $\alpha$, namely,
$\Sigma$. However, if one has $\lambda \cap V$, then one must use the $V$ of the same type as
$\lambda$, and for this $V$ we would have to use the class to which $\alpha$ is restricted,
namely, $\mathrm{SC}(\Sigma)$. Accordingly, in performing restricted complementation,
the restricted complement of $\alpha$ would be $\Sigma - \alpha$, but the restricted
complement of $\lambda$ would be $\mathrm{SC}(\Sigma) - \lambda$.
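
The way the universal class and the complement shift from level to level can be illustrated with a three-element range. The sketch below assumes finite `frozenset` models; the names `sc`, `sigma`, `alpha`, and `lam` are illustrative choices, not notation fixed by the text.

```python
from itertools import chain, combinations

def sc(a):
    """SC(a): all subclasses of the finite class a."""
    items = list(a)
    return frozenset(frozenset(c) for c in chain.from_iterable(
        combinations(items, r) for r in range(len(items) + 1)))

sigma = frozenset({0, 1, 2})      # range of x; acts as V for classes like alpha
sc_sigma = sc(sigma)              # range of alpha; acts as V for classes like lam

alpha = frozenset({0, 1})                                # a member of SC(sigma)
lam = frozenset({frozenset({0}), frozenset({1, 2})})     # a member of SC^2(sigma)

# Each level has its own universal class.
assert alpha & sigma == alpha
assert lam & sc_sigma == lam

# Restricted complements are taken relative to the matching level.
comp_alpha = sigma - alpha        # restricted complement of alpha
comp_lam = sc_sigma - lam         # restricted complement of lam
assert comp_alpha == frozenset({2})
assert len(comp_lam) == 6
assert (comp_alpha | alpha) == sigma
assert (comp_lam | lam) == sc_sigma
```

Taking `sigma - lam` instead would mix levels: `lam` has classes as members while `sigma` has individuals, so the difference would simply return `sigma` unchanged and tell us nothing.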

In  other  words,  the  type  distinctions  of  our  hierarchy  of  types  must  be 
very  carefully  adhered  to. 

It does not noticeably complicate the formulas to use the appropriate
one of $\Sigma$, $\mathrm{SC}(\Sigma)$, etc., for $V$ at the proper places when dealing with variables
of restricted ranges, and it helps us keep the range in mind. However, we
shall simplify $\Sigma - \alpha$, $\mathrm{SC}(\Sigma) - \lambda$, etc., to just $\overline{\alpha}$, $\overline{\lambda}$, etc., and expect the
reader to remember that complementation for variables of restricted ranges
is restricted complementation.

It  is  not  completely  trivial  to  show  that  the  previous  theorems  of  the 
present  chapter  are  valid  when  so  interpreted,  and  we  devote  the  rest  of 
the  section  to  verifying  this  fact. 

We  start  with  the  axioms.  All  but  Axiom  scheme  12  have  been  covered 
by  earlier  discussions  dealing  with  restricted  quantification. 

Theorem IX.7.7. $\vdash (\Sigma).\exists\hat{u}P \supset \exists\hat{u}(u \in \Sigma.P)$.

Proof. Assume $\exists\hat{u}P$. Then by Thm.IX.3.2, Cor. 1, $u \in \hat{u}P. \equiv .P$. So
$u \in \Sigma.u \in \hat{u}P: \equiv :u \in \Sigma.P$. Thus, $u \in \Sigma \cap \hat{u}P: \equiv :u \in \Sigma.P$. So
$(E\phi)(u):u \in \phi. \equiv .u \in \Sigma.P$. That is, $\exists\hat{u}(u \in \Sigma.P)$.

Theorem IX.7.8. If $P$ is stratified and $x$ and $\alpha$ are distinct variables and
$\alpha$ does not occur in $P$, then $\vdash (E\alpha)(x).x \in \alpha \equiv P$.

Proof. Use Thm.IX.7.7 and Thm.IX.7.5.

We may as well agree that, when the variable $x$ is restricted to $\Sigma$, then
$\exists\hat{x}P$ shall denote $(E\alpha)(x).x \in \alpha \equiv P$ where $\alpha$ is a variable which does not
occur in $P$ and is restricted to $\mathrm{SC}(\Sigma)$. Then our analogue of Axiom scheme
12 says that, if $P$ is stratified, then $\exists\hat{x}P$.

The  theorems  in  Sec.  2  of  the  present  chapter  are  readily  proved  when 
written  with  restricted  variables.  Indeed,  the  restricted  form  is  a  ready 
consequence  of  the  unrestricted  form. 

Turning to Sec. 3, we note that we have taken care of Thm.IX.3.1 by
Thm.IX.7.8. A useful lemma for later theorems is:
Theorem IX.7.9. $\vdash \exists\hat{x}P. \supset .\hat{x}P \subseteq \Sigma$.

Proof. Assume $\exists\hat{x}F(x)$. Then by Thm.IX.7.5, $\exists\hat{u}(u \in \Sigma.F(u))$. So by
Thm.IX.3.2, Cor. 1, $u \in \hat{u}(u \in \Sigma.F(u)). \supset .u \in \Sigma$. So $\hat{u}(u \in \Sigma.F(u)) \subseteq \Sigma$.
Accordingly, by Thm.IX.7.6, $\hat{x}F(x) \subseteq \Sigma$.

Corollary. $\vdash \exists\hat{x}P. \supset .\hat{x}P = \hat{x}P \cap \Sigma$.

Theorem IX.7.10. $\vdash \exists\hat{x}P: \supset :(x).x \in \hat{x}P \equiv P$.

Proof. Assume $\exists\hat{x}F(x)$. Then $\exists\hat{u}(u \in \Sigma.F(u))$. So $u \in \hat{u}(u \in \Sigma.F(u)). \equiv .u \in \Sigma.F(u)$.
So $(u):u \in \Sigma. \supset .u \in \hat{u}(u \in \Sigma.F(u)) \equiv F(u)$. That is,
$(x).x \in \hat{u}(u \in \Sigma.F(u)) \equiv F(x)$. However, $\hat{u}(u \in \Sigma.F(u)) = \hat{x}F(x)$.

By use of this theorem and Thm.IX.7.3 (with an assist from
Thm.IX.7.9), we can prove that, if there is no free $\beta$ in $P$, then
$\vdash \exists\hat{x}P.: \supset :.(\beta):\beta = \hat{x}P. \equiv .(x).x \in \beta \equiv P$. This takes care of Thm.IX.3.2.

With regard to $\{A \mid P\}$, it seems clear that, if some of the $y$'s which occur
free in both $A$ and $P$ are subject to certain restrictions, one has merely to
subject these $y$'s to the same restrictions in the definition
$\hat{u}(Ey_1, y_2, \ldots, y_n).u = A.P$ in order to make $\{A \mid P\}$ have the desired characteristics.
In case $A$ is such a function of $y_1, y_2, \ldots, y_n$ that we have
$\vdash (y_1, y_2, \ldots, y_n).A \in \Sigma$, then one can prove
$\vdash \exists\{A \mid P\}. \supset .\hat{u}((Ey_1, y_2, \ldots, y_n).u = A.P) = \hat{x}((Ey_1, y_2, \ldots, y_n).x = A.P)$.
In such case one has the option of writing
$\hat{x}(Ey_1, y_2, \ldots, y_n).x = A.P$ for $\{A \mid P\}$ if it serves any useful purpose
(such as uniformity of notation).

With this understanding about $\{A \mid P\}$, the reader can check the proofs
of the remaining theorems of Sec. 3 and verify that the proofs given for
unrestricted variables generalize to the case of restricted variables. We
might just remark with regard to Thm.IX.3.6, Cor. 1, that one would have

$\vdash (\delta).\hat{x}(x \in \delta) = \delta \cap \Sigma$

but

$\vdash (\alpha).\hat{x}(x \in \alpha) = \alpha$.

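As a lemma for the theorems of Sec. 4, we prove:
Theorem IX.7.11.
I. $\vdash (x).x \in \Sigma \equiv x = x$.
II. $\vdash (x).x \in \Lambda \equiv x \neq x$.

III. $\vdash (\alpha,\beta,x):x \in \alpha \cap \beta. \equiv .x \in \alpha.x \in \beta$.

IV. $\vdash (\alpha,\beta,x):x \in \alpha \cup \beta. \equiv .x \in \alpha.\lor.x \in \beta$.
V. $\vdash (\alpha,x):x \in \overline{\alpha}. \equiv .\sim x \in \alpha$.

Proof of Part I. We note that it is just $\vdash (u):u \in \Sigma. \supset .u \in \Sigma \equiv x = x$,
which is easily proved.

Parts II, III, and IV are trivial. To prove Part V, note that it is
$\vdash (\delta):.\delta \subseteq \Sigma: \supset :(x):x \in \Sigma - \delta. \equiv .x \in \overline{\delta}$, which follows by Thm.IX.7.1.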

With the help of this result, we can carry through the proofs for restricted
variables of the theorems of Secs. 4, 5, and 6 with no great difficulty. We
must remember to replace $V$ by $\Sigma$ in all places.

In the proof of Thm.IX.4.6, we hit a slight snag. The use of free $x$'s in
the proof involves the assumption $x \in \Sigma$. Hence, we conclude not $\vdash \alpha \neq \overline{\alpha}$,
but $\vdash x \in \Sigma \supset \alpha \neq \overline{\alpha}$. That is, Thm.IX.4.6 becomes $\vdash \Sigma \neq \Lambda. \supset .(\alpha).\alpha \neq \overline{\alpha}$.

The reader will recollect that, in dealing with restricted quantification,
one needed to know that $(Ex)\ K(x)$; analogously in dealing with variables
over the range $\Sigma$ we shall need to know $(Ex)\ x \in \Sigma$ or $\Sigma \neq \Lambda$. If we always
use a $\Sigma$ such that $\vdash \Sigma \neq \Lambda$, then we can carry out all theorems of Secs. 4, 5,
and 6 for restricted quantification.

The $\{\alpha\}$ of Thm.IX.6.9 denotes $\hat{\delta}(\delta \subseteq \Sigma.\delta = \alpha)$
rather than $\hat{\delta}(\delta \in \Sigma.\delta = \alpha)$, of course. This is a
consequence of adhering to our type hierarchy.
Correspondingly,  we  would  have  to  write  Thm.IX.6.10,  Part  III,  in  the 
form 

h  («,^):./3  6  USC(a):  ^  ■.{Ey).y  e  a./3  =  {y\ 

so  that  it  would  be  interpreted  as 

|-  (0,w)::0  ^  2.M  e  2.:  D  :.M  e  USC(<^):  =  '.{Ev).v  e  2.V  €  0.W  =   \v] 
rather  than  as 

h  {4>,y):-<t>  Q  2.W  e  2.:  D  :.u  e  USC(<^):  =  :(Ev).v  €  2.y  e  <t>.U  =   {v}. 

So  far  as  we  know,  the  latter  is  not  provable. 

In  Cor.  3  of  Thm.IX.6.10,  one  must  write  a  X  in  place  of  x  for  similar 
reasons. 

This  illustrates  what  we  said  earlier,  that  use  of  restricted  variables 
forces  upon  us  a  hierarchy  of  types  which  must  be  rigidly  adhered  to. 

Likewise  in  Thms.IX.6.13  and  IX.6.14,  one  should  write  X  in  place  of  a 
to  ensure  the  right  sort  of  restrictions  being  implied  in  the  use  of  the  vari- 
able. 

EXERCISES

IX.7.1. Prove $\vdash (\Sigma,\phi):: \phi \subseteq \Sigma : (u): u \in \Sigma .\supset. u \in \phi \equiv P .: \equiv :. (u): u \in \phi .\equiv. u \in \Sigma . P$.

IX.7.2. Criticize the following proof.

$\vdash u \in \Sigma .\supset. u \in \phi \supset u \in \Sigma$.

So

$\vdash (x).x \in \phi \supset x \in \Sigma$.

So

$\vdash \phi \subseteq \Sigma$.

IX.7.3. Prove $\vdash \hat{x}P \subseteq \Sigma$.

IX.7.4. Prove $\vdash (\delta): \hat{x}(x \in \delta) = \delta \cap \Sigma$.

8. Applications. As an illustration of the use of classes in mathematics,
we shall consider Hausdorff spaces.

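A Hausdorff space is a set, $\Sigma$, of points such that, with each point, x,
of $\Sigma$ is associated a set of neighborhoods, $H(x)$, of x, the neighborhoods
being needed to satisfy certain requirements known as the Hausdorff axioms.
The Hausdorff axioms are:

0a. $(x,\alpha): x \in \Sigma . \alpha \in H(x) .\supset. \alpha \subseteq \Sigma$.

0b. $\Sigma \neq \Lambda$.

1a. $(x): x \in \Sigma .\supset. H(x) \neq \Lambda$.

1b. $(x,\alpha): x \in \Sigma . \alpha \in H(x) .\supset. x \in \alpha$.

2. $(x,\alpha,\beta): x \in \Sigma . \alpha,\beta \in H(x) .\supset. (E\gamma).\gamma \in H(x).\gamma \subseteq \alpha \cap \beta$.

3. $(x,y,\alpha): x,y \in \Sigma . \alpha \in H(x) . y \in \alpha .\supset. (E\beta).\beta \in H(y).\beta \subseteq \alpha$.

4. $(x,y): x,y \in \Sigma . x \neq y .\supset. (E\alpha,\beta).\alpha \in H(x).\beta \in H(y).\alpha \cap \beta = \Lambda$.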

We agree that $H(x)$ shall be considered as stratified and have type two
higher than x. Then the axioms given above are stratified. If one wishes
to write a definition of $H(x)$ in some special case, these conditions of
stratification must be adhered to.

We can consider that x is the only variable which has free occurrences in
$H(x)$. In some applications, it may be desirable to let $H(x)$ contain free
occurrences of other variables which would serve as parameters. If these
additional variables are distinct from all variables which appear in the
developments of this chapter, the effect for these developments would be
the same as though x were the only free variable in $H(x)$. Thus it is quite
possible for such extra variables to be used as parameters, which are
effectively unchanging throughout the developments of the present chapter.

Of these seven axioms, Axioms 0a and 0b are never stated, though they
are always implicitly assumed. Axioms 1a and 1b are usually not stated
separately, but their logical product is stated as a single axiom, the "first
Hausdorff axiom." Thus it comes about that one usually refers to the
"four Hausdorff axioms" for a Hausdorff space.

There  is  no  agreement  as  to  just  how  general  a  Hausdorff  space  should  be. 
Actually  spaces  of  differing  degrees  of  generality  are  often  considered,  the 
different  degrees  of  generality  being  achieved  by  stating  Axiom  4  in  different 
forms  with  varying  degrees  of  restrictiveness.  In  this  connection,  see 
Hausdorff,  1927,  especially  Sec.  40,  page  229. 

The  form  of  Axiom  4  which  we  give  here  is  one  of  the  more  commonly 
used  forms. 
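For a finite set of points, the seven axioms can be checked mechanically. The following Python sketch is only an illustration: the function name `satisfies_hausdorff_axioms` and the representation of $H(x)$ as a dictionary from points to sets of frozensets are assumptions made for the example, not anything prescribed by the text.

```python
from itertools import combinations

def satisfies_hausdorff_axioms(space, H):
    """Check Axioms 0a, 0b, 1a, 1b, 2, 3, 4 for a finite space.

    space: frozenset of points (the set Sigma).
    H: dict mapping each point x to a set of frozensets (the class H(x)).
    """
    if not space:                                   # Axiom 0b
        return False
    for x in space:
        nbhds = H[x]
        if not nbhds:                               # Axiom 1a
            return False
        for a in nbhds:
            if not a <= space or x not in a:        # Axioms 0a and 1b
                return False
            for y in a:                             # Axiom 3
                if not any(b <= a for b in H[y]):
                    return False
        for a in nbhds:                             # Axiom 2
            for b in nbhds:
                if not any(c <= a & b for c in nbhds):
                    return False
    for x, y in combinations(space, 2):             # Axiom 4
        if not any(not (a & b) for a in H[x] for b in H[y]):
            return False
    return True

points = frozenset({0, 1, 2})
# Each point has the single neighborhood {x}.
discrete = {x: {frozenset({x})} for x in points}
# Each point has the whole space as its only neighborhood.
indiscrete = {x: {points} for x in points}

print(satisfies_hausdorff_axioms(points, discrete))
print(satisfies_hausdorff_axioms(points, indiscrete))
```

The second system satisfies every axiom except Axiom 4, since two distinct points never have disjoint neighborhoods there.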

In order to put the axioms easily into our symbolism, we have deviated
slightly from the usual mode of statement. We use $H(x)$ for the set of all
neighborhoods of a point, whereas in the usual formulation $N_x$ is used to
denote an unspecified neighborhood. $N_x$ would then be a member of $H(x)$.
For comparison we quote a presentation of the Hausdorff axioms (see
Bohnenblust, 1937, Chapter 4, Sec. 4.1).¹

"According to Hausdorff, a space is a set of elements, in which certain
subsets will play a distinguished role. These are called neighborhoods. To
be regarded as neighborhoods, a class of subsets must satisfy the following
axioms.

"(1) Any element x has at least one neighborhood $N_x$, and is an element
of any one of its neighborhoods.

"(2) If there are two neighborhoods of x, then there exists a neighborhood
of x which is contained in each of these.

"(3) If y lies in $N_x$, then there exists a neighborhood $N_y$ of y which is
contained in $N_x$.

"(4) If x and y are distinct, there exist an $N_x$ and an $N_y$ with no common
element."

We note the following examples of Hausdorff spaces.

A. $\Sigma$ is the set of rational points between zero and one. The
neighborhoods are the open intervals, going from $n/m$ to $(n + 1)/m$ for all
pairs of integers n and m with $0 < n < m$.

The neighborhoods of a point x are just those neighborhoods which
contain x. So

$H(x) = \hat{\alpha}(Em,n): 0 < n < m . \frac{n}{m} < x < \frac{n+1}{m} . \alpha = \hat{y}\left(\frac{n}{m} < y < \frac{n+1}{m}\right)$.

Here we are using restricted ranges on m, n, and y with m and n restricted
to integers and y to rational numbers.

B. $\Sigma$ is the set of all real points on the line. The neighborhoods of a
point x are all open intervals which contain x. So

$H(x) = \hat{\alpha}(E\epsilon,\delta)\ \alpha = \hat{y}(x - \epsilon < y < x + \delta)$.

Here we are using restricted ranges on $\epsilon$ and $\delta$, namely, the usual
ones, that $\epsilon$ and $\delta$ are real and positive. It is not necessary to use a
restricted range for y, since the condition $x - \epsilon < y < x + \delta$ implies
that y is real.

C. $\Sigma$ is the set of all complex points in the complex plane. The
neighborhoods of a point z are all open circles with center at z. So

$H(z) = \hat{\alpha}(ER)\ \alpha = \hat{w}(|w - z| < R)$.

Here we are using restricted ranges on R and w, with R restricted to positive
real numbers and w to complex numbers.

We now prove a few of the fundamental theorems about Hausdorff
spaces. To simplify notation, we use x, y, and z with a restricted range,
namely, the range $\Sigma$. Then $\alpha$, $\beta$, and $\gamma$ are restricted to
SC($\Sigma$). Also the universal class of the type of $\alpha$ is $\Sigma$. Further, to
avoid collision with the notation for the closure of a set, we use $-\alpha$ for
$\hat{x}(\sim x \in \alpha)$.

¹ From "Theory of Functions of Real Variables" by H. F. Bohnenblust, published
1937 at Princeton University, quoted by permission.

With variables of restricted ranges, the Hausdorff axioms take the form:

0a. $(x).H(x) \subseteq \mathrm{SC}(\Sigma)$.

0b. $\Sigma \neq \Lambda$.

1a. $(x).H(x) \neq \Lambda$.

1b. $(x).x \in \bigcap H(x)$.

2. $(x,\alpha,\beta): \alpha,\beta \in H(x) .\supset. (E\gamma).\gamma \in H(x).\gamma \subseteq \alpha \cap \beta$.

3. $(x,y,\alpha): \alpha \in H(x) . y \in \alpha .\supset. (E\beta).\beta \in H(y).\beta \subseteq \alpha$.

4. $(x,y): x \neq y .\supset. (E\alpha,\beta).\alpha \in H(x).\beta \in H(y).\alpha \cap \beta = \Lambda$.

We quote some definitions from Bohnenblust, 1937.²

"Definition. Any subset S of the given space will be called open with
respect to a set of neighborhoods if for any element x in S there exists an
$N_x$ contained in S."

"Definition. A subset of a space is said to be closed if its complement is
open in the space."

"Definition. We call x a limit point of the set S if every open set
containing x, whether x belongs to S or not, contains at least one element of S
different from x."

"Definition. The set of limit points of S is called the derived set of S and
is designated by S'."

"Definition. S + S' is called the closure of S and is denoted by $\bar{S}$."

In translating these into symbolic logic, we define not the term "open
set" but the class of open sets. If we let "OS" stand for the class of open
sets, then the English statement "x is an open set" can be translated as
"$x \in \mathrm{OS}$". That is, in symbolic logic, we do not have available both of
the two alternative, but synonymous, statements:

"x is an open set."

"x is a member of the class of open sets."

We have only the second. Clearly this will suffice for mathematical
purposes, though it makes our symbolic logic statements rather stilted if
translated literally.

Our earlier use of $H(x)$ instead of $N_x$ is another instance of the same thing.
Instead of saying that $\alpha$ (or $N_x$) is a neighborhood of x, we say that
$\alpha$ is a member of the class of neighborhoods of x ($\alpha \in H(x)$).

Similarly we define the class "CS" of closed sets. Further, where
Bohnenblust defines both "limit point" and class of limit points (derived set),
we get along with only the latter, and write "x is a limit point of S" as
"$x \in S'$".

² Ibid.

Our definitions, using $\alpha$ instead of Bohnenblust's S, are:

OS for $\hat{\alpha}(x): x \in \alpha .\supset. (E\beta).\beta \in H(x).\beta \subseteq \alpha$.

CS for $\hat{\alpha}(-\alpha \in \mathrm{OS})$.

$\alpha'$ for $\hat{x}(\beta): x \in \beta . \beta \in \mathrm{OS} .\supset. (Ey).y \neq x . y \in \beta . y \in \alpha$.

$\bar{\alpha}$ for $\alpha \cup \alpha'$.

Clearly OS, CS, and $\alpha'$ are defined by stratified statements, so that we
may apply Thms.IX.7.8 and IX.7.10 and infer:

$(\alpha):. \alpha \in \mathrm{OS} : \equiv : (x): x \in \alpha .\supset. (E\beta).\beta \in H(x).\beta \subseteq \alpha$.

$(\alpha): \alpha \in \mathrm{CS} .\equiv. -\alpha \in \mathrm{OS}$.

$(x):. x \in \alpha' : \equiv : (\beta): x \in \beta . \beta \in \mathrm{OS} .\supset. (Ey).y \neq x . y \in \beta . y \in \alpha$.

In the hierarchy of types which accompanies the use of a restricted range
of values for the variables, OS and CS must have the same type as SC($\Sigma$),
and $\alpha'$ and $\bar{\alpha}$ must have the same type as $\Sigma$. Adherence to this
type hierarchy ensures stratification.
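The definitions of OS, CS, $\alpha'$, and $\bar{\alpha}$ can be evaluated directly on a finite example. The sketch below is illustrative only; the helper names (`subsets`, `is_open`, `open_sets`, `is_closed`, `derived`, `closure`) and the three-point neighborhood system are assumptions of the example. The system used satisfies Axioms 0a through 3 but not Axiom 4, which does no harm here because the definitions themselves make no use of Axiom 4.

```python
from itertools import combinations

def subsets(space):
    """All subsets of a finite set, as frozensets."""
    items = list(space)
    return [frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]

def is_open(a, H):
    # alpha is in OS: every x in alpha has a neighborhood inside alpha.
    return all(any(b <= a for b in H[x]) for x in a)

def open_sets(space, H):
    return [a for a in subsets(space) if is_open(a, H)]

def is_closed(a, space, H):
    # alpha is in CS: the complement of alpha is in OS.
    return is_open(space - a, H)

def derived(a, space, H):
    # alpha': x such that every open set containing x
    # contains some y != x belonging to alpha.
    opens = open_sets(space, H)
    return frozenset(
        x for x in space
        if all(any(y != x and y in a for y in b) for b in opens if x in b)
    )

def closure(a, space, H):
    # The closure of alpha is the union of alpha and alpha'.
    return a | derived(a, space, H)

space = frozenset({0, 1, 2})
H = {0: {frozenset({0})}, 1: {frozenset({0, 1})}, 2: {space}}

print(sorted(sorted(a) for a in open_sets(space, H)))
print(sorted(derived(frozenset({0}), space, H)))
print(sorted(closure(frozenset({0}), space, H)))
```

In this example the open sets form the chain $\Lambda$, {0}, {0, 1}, $\Sigma$, and the derived set of {0} is {1, 2}.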

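Theorem IX.8.1. $(\alpha,\beta): \alpha,\beta \in \mathrm{OS} .\supset. \alpha \cap \beta \in \mathrm{OS}$.

In words, "The intersection of two open sets is open".

Proof. Assume $\alpha,\beta \in \mathrm{OS}$ and $\alpha \subseteq \Sigma$ and $\beta \subseteq \Sigma$. Now let
$x \in \Sigma$ and $x \in \alpha \cap \beta$. Then $x \in \alpha$ and $x \in \beta$. Then by rule C and
the definition of OS, $\gamma \subseteq \Sigma$, $\gamma \in H(x).\gamma \subseteq \alpha$, $\delta \subseteq \Sigma$,
and $\delta \in H(x).\delta \subseteq \beta$. So by Axiom 2 and rule C, $\phi \subseteq \Sigma$ and
$\phi \in H(x).\phi \subseteq \gamma \cap \delta$. Then $\phi \subseteq \alpha \cap \beta$.

Theorem IX.8.2. $(\lambda): \lambda \subseteq \mathrm{OS} .\supset. \bigcup\lambda \in \mathrm{OS}$.

In words, "The sum of any number, finite or infinite, of open sets is
open".

Proof. Assume $\lambda \subseteq \mathrm{SC}(\Sigma)$ and $\lambda \subseteq \mathrm{OS}$. Now let $x \in \Sigma$ and
$x \in \bigcup\lambda$. Then by rule C, $\alpha \subseteq \Sigma . \alpha \in \lambda . x \in \alpha$. Hence
$\alpha \in \mathrm{OS}$. So by the definition of OS, $(E\beta).\beta \in H(x).\beta \subseteq \alpha$. So by
Thm.IX.5.5, Part II, $(E\beta).\beta \in H(x).\beta \subseteq \bigcup\lambda$.

Theorem IX.8.3. $(\alpha,\beta): \alpha,\beta \in \mathrm{CS} .\supset. \alpha \cup \beta \in \mathrm{CS}$.

In words, "The sum of two closed sets is closed".

Proof. Assume $\alpha \subseteq \Sigma$, $\beta \subseteq \Sigma$, and $\alpha,\beta \in \mathrm{CS}$. Then
$-\alpha,-\beta \in \mathrm{OS}$ by the definition of CS. So by Thm.IX.8.1,
$-\alpha \cap -\beta \in \mathrm{OS}$. So $-(\alpha \cup \beta) \in \mathrm{OS}$. That is, $\alpha \cup \beta \in \mathrm{CS}$.

Theorem IX.8.4. $(\lambda): \lambda \subseteq \mathrm{CS} .\supset. \bigcap\lambda \in \mathrm{CS}$.

In words, "The intersection of any number, finite or infinite, of closed
sets is closed".

Proof. Assume $\lambda \subseteq \mathrm{SC}(\Sigma)$ and $\lambda \subseteq \mathrm{CS}$. Then
$\{-\alpha \mid \alpha \in \lambda\} \subseteq \mathrm{OS}$. So by Thm.IX.8.2,
$\bigcup\{-\alpha \mid \alpha \in \lambda\} \in \mathrm{OS}$. So by Thm.IX.5.11, Part II,
$-\bigcap\lambda \in \mathrm{OS}$. So $\bigcap\lambda \in \mathrm{CS}$.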
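Theorems IX.8.1 through IX.8.4 can be spot-checked by brute force on a small finite example. This is a sanity check under stated assumptions, not a proof: the function name `check_theorems_1_to_4` and the three-point neighborhood system are hypothetical choices for illustration. The intersection of the null family of closed sets is taken to be the whole space, matching the convention that the universal class of the relevant type is $\Sigma$.

```python
from itertools import combinations

def subsets(items):
    """All subsets of a finite collection, as frozensets."""
    items = list(items)
    return [frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]

def is_open(a, H):
    return all(any(b <= a for b in H[x]) for x in a)

def check_theorems_1_to_4(space, H):
    opens = [a for a in subsets(space) if is_open(a, H)]
    closeds = [space - a for a in opens]
    # IX.8.1: the intersection of two open sets is open.
    inter_open = all((a & b) in opens for a in opens for b in opens)
    # IX.8.2: the sum of any family of open sets is open.
    union_open = all(frozenset().union(*fam) in opens
                     for fam in subsets(opens))
    # IX.8.3: the sum of two closed sets is closed.
    union_closed = all((a | b) in closeds for a in closeds for b in closeds)
    # IX.8.4: the intersection of any family of closed sets is closed.
    inter_closed = all(space.intersection(*fam) in closeds
                       for fam in subsets(closeds))
    return inter_open, union_open, union_closed, inter_closed

space = frozenset({0, 1, 2})
H = {0: {frozenset({0})}, 1: {frozenset({0, 1})}, 2: {space}}
print(check_theorems_1_to_4(space, H))
```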

Theorem IX.8.5. $(x,\alpha): \alpha \in H(x) .\supset. \alpha \in \mathrm{OS}$.

In words, "Any neighborhood of any point is open".

Proof. Rewrite Axiom 3 in the form
$(x,\alpha):. \alpha \in H(x) : \supset : (y): y \in \alpha .\supset. (E\beta).\beta \in H(y).\beta \subseteq \alpha$.

Theorem IX.8.6. $(\alpha): \alpha \in \mathrm{CS} .\supset. \alpha' \subseteq \alpha$.

In words, "A closed set contains all its limit points".

Proof. Assume $\alpha \subseteq \Sigma$ and $\alpha \in \mathrm{CS}$. Then $-\alpha \in \mathrm{OS}$. To prove
$\alpha' \subseteq \alpha$, we assume $x \in \Sigma$ and $x \in \alpha'$ and prove $x \in \alpha$ by reductio ad
absurdum, to which end we assume $\sim x \in \alpha$. Then $x \in -\alpha$. Since
$-\alpha \in \mathrm{OS}$, we have by rule C that $\beta \subseteq \Sigma$, $\beta \in H(x)$, and
$\beta \subseteq -\alpha$. Then by Axiom 1b and Thm.IX.8.5, $x \in \beta . \beta \in \mathrm{OS}$. So, since
$x \in \alpha'$, we have by rule C, $y \in \Sigma$, $y \neq x$, $y \in \beta$, and $y \in \alpha$. But
$\beta \subseteq -\alpha$, so that $y \in -\alpha$, so that $\sim y \in \alpha$. This is the desired
contradiction.

Theorem IX.8.7. $(\alpha): \alpha' \subseteq \alpha .\supset. \alpha \in \mathrm{CS}$.

In words, "Any set is closed if it contains all its limit points".

Proof. We first prove:

Lemma. $\alpha \subseteq \Sigma$, $x \in -\alpha$, $(\beta): \beta \in H(x) .\supset. \sim(\beta \subseteq -\alpha) \vdash x \in \alpha'$.

Proof. Assume

(1) $\alpha \subseteq \Sigma$,

(2) $x \in -\alpha$,

(3) $(\beta): \beta \in H(x) .\supset. \sim(\beta \subseteq -\alpha)$,

(4) $\gamma \subseteq \Sigma$,

(5) $x \in \gamma . \gamma \in \mathrm{OS}$.

By (4), (5), and rule C, $\beta \subseteq \Sigma$, $\beta \in H(x)$, and

(6) $\beta \subseteq \gamma$.

Then by (3), $\sim(\beta \subseteq -\alpha)$, which is equivalent to $(Ey).y \in \beta . y \in \alpha$ by the
duality theorem. So by rule C,

(7) $y \in \Sigma$,

(8) $y \in \beta$,

(9) $y \in \alpha$.

From (2), we get $\sim x \in \alpha$, so that by (9) and Thm.VII.1.4,

(10) $y \neq x$.

Also by (6) and (8)

(11) $y \in \gamma$.

Then by (7), (10), (11), and (9),

$(Ey).y \neq x . y \in \gamma . y \in \alpha$.

Since this is derived from (1), (2), (3), (4), and (5), we infer

(1), (2), (3) $\vdash x \in \alpha'$

which is our lemma.

We now prove the theorem by reductio ad absurdum, to which end we
assume $\alpha \subseteq \Sigma$, $\alpha' \subseteq \alpha$, and $\sim \alpha \in \mathrm{CS}$. The last gives
$\sim(-\alpha \in \mathrm{OS})$, from which we get
$(Ex): x \in -\alpha : (\beta): \beta \in H(x) .\supset. \sim(\beta \subseteq -\alpha)$. Then by rule C
and our lemma, we get $x \in -\alpha$ and $x \in \alpha'$. That is, $\sim x \in \alpha$ and $x \in \alpha'$.
But from $x \in \alpha'$ and $\alpha' \subseteq \alpha$, we get $x \in \alpha$, which is a contradiction.

Corollary 1. $(\alpha): \alpha \in \mathrm{CS} .\equiv. \alpha' \subseteq \alpha$.

This agrees with the definition of closed set which is often given and
which we quoted in Sec. 9 of Chapter VI, namely: "A set is said to be closed
if it contains all its limit points."

Corollary 2. $(\alpha): \alpha \in \mathrm{CS} .\equiv. \bar{\alpha} = \alpha$.

In words, "A set is closed if and only if it is identical with its closure".

By using Thm.VI.8.1, we could avoid the necessity of proving the lemma
preparatory to proving our theorem.

We  are  now  able  to  prove  that  the  derived  set  of  a  set  is  closed.  However, 
this  result  requires  use  of  Axiom  4,  and  as  we  have  not  yet  used  Axiom  4 
we  shall  postpone  this  result  until  after  we  have  proved  a  few  other  results 
which  can  be  proved  without  Axiom  4. 

Theorem IX.8.8. $(\alpha,\beta): \alpha \subseteq \beta .\supset. \alpha' \subseteq \beta'$.

Proof. From $\alpha \subseteq \beta$, we readily get

$(Ey).y \neq x . y \in \theta . y \in \alpha : \supset : (Ey).y \neq x . y \in \theta . y \in \beta$.

From this, we readily get

$(\theta): x \in \theta . \theta \in \mathrm{OS} .\supset. (Ey).y \neq x . y \in \theta . y \in \alpha .: \supset :. (\theta): x \in \theta . \theta \in \mathrm{OS} .\supset. (Ey).y \neq x . y \in \theta . y \in \beta$.

Hence $\alpha' \subseteq \beta'$.

Corollary. $(\alpha,\beta): \alpha \subseteq \beta .\supset. \bar{\alpha} \subseteq \bar{\beta}$.

Up to now, we have carefully inserted all hypotheses and conclusions of
the form $x \in \Sigma$, $\alpha \subseteq \Sigma$, etc., which are needed to take account properly of
the fact that we are using restricted variables. However, since the proofs
are getting rather long, we shall shorten them by omitting these. Strictly
speaking, such omissions are not proper, but the omitted steps can readily
be supplied by the reader.

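Theorem IX.8.9. $(\alpha,\beta).(\alpha \cup \beta)' = \alpha' \cup \beta'$.

In words, "The derivative of a sum is the sum of the derivatives".

Proof. By Thm.IX.8.8, $\alpha' \subseteq (\alpha \cup \beta)'$ and $\beta' \subseteq (\alpha \cup \beta)'$. So by
Thm.IX.4.18, Part II,

(1) $\alpha' \cup \beta' \subseteq (\alpha \cup \beta)'$.

Now assume

(2) $x \in (\alpha \cup \beta)'$.

Case 1. $x \in \alpha'$. Then $x \in \alpha' \cup \beta'$.

Case 2. $x \in \beta'$. Then $x \in \alpha' \cup \beta'$.

Case 3. $(\sim x \in \alpha')\&(\sim x \in \beta')$. By the duality theorem and rule C, we
get

(3) $x \in \gamma . \gamma \in \mathrm{OS}$

(4) $(y): y \neq x . y \in \gamma .\supset. \sim y \in \alpha$

from $\sim x \in \alpha'$ and

(5) $x \in \delta . \delta \in \mathrm{OS}$

(6) $(y): y \neq x . y \in \delta .\supset. \sim y \in \beta$

from $\sim x \in \beta'$. By (3), (5), and Thm.IX.8.1, $x \in \gamma \cap \delta . \gamma \cap \delta \in \mathrm{OS}$. So by
rule C,

(7) $\phi \in H(x)$,

(8) $\phi \subseteq \gamma \cap \delta$.

Then by Axiom 1b and Thm.IX.8.5, $x \in \phi . \phi \in \mathrm{OS}$. Then by rule C and the
assumption $x \in (\alpha \cup \beta)'$, we get

(9) $y \neq x$,

(10) $y \in \phi$,

(11) $y \in \alpha \cup \beta$.

From (10) and (8), we get $y \in \gamma \cap \delta$, and so $y \in \gamma$ and $y \in \delta$. Then by
(4), (6), and (9), we get $\sim y \in \alpha$ and $\sim y \in \beta$. So

(12) $(\sim y \in \alpha)\&(\sim y \in \beta)$.

However, from (11), $(y \in \alpha) \lor (y \in \beta)$. This gives

(13) $\sim((\sim y \in \alpha)\&(\sim y \in \beta))$.

By truth values, $\vdash P{\sim}P \supset Q$. Taking P to be $(\sim y \in \alpha)\&(\sim y \in \beta)$ and
Q to be $x \in \alpha' \cup \beta'$, and using (12) and (13), we get $x \in \alpha' \cup \beta'$.

Corollary. $(\alpha,\beta).\overline{\alpha \cup \beta} = \bar{\alpha} \cup \bar{\beta}$.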

Theorem IX.8.10. $(x,\alpha):. x \in \bar{\alpha} : \equiv : (\beta): x \in \beta . \beta \in \mathrm{OS} .\supset. (Ey).y \in \beta . y \in \alpha$.

In words, "We say that x is in the closure of $\alpha$ if every open set containing
x contains at least one element of $\alpha$".

Proof. Assume $x \in \bar{\alpha}$. Then $x \in \alpha \lor x \in \alpha'$.

Case 1. $x \in \alpha'$. Then the right side follows easily.

Case 2. $x \in \alpha$. Then the right side again follows easily by taking y to
be x.

Conversely, assume

(1) $(\beta): x \in \beta . \beta \in \mathrm{OS} .\supset. (Ey).y \in \beta . y \in \alpha$.

Case 1. $x \in \alpha'$. Then $x \in \bar{\alpha}$.

Case 2. $\sim x \in \alpha'$. Then by the duality theorem and rule C,

(2) $x \in \beta . \beta \in \mathrm{OS}$,

(3) $(y): y \in \beta . y \in \alpha .\supset. y = x$.

By (1) and (2) and rule C, $y \in \beta . y \in \alpha$. So by (3), $y = x$. Combining this
with $y \in \alpha$ gives $x \in \alpha$. So $x \in \bar{\alpha}$.

Corollary. $(x,\alpha):. x \in \bar{\alpha} : \equiv : (\beta): x \in \beta . \beta \in \mathrm{OS} .\supset. \alpha \cap \beta \neq \Lambda$.
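The characterization of the closure in the corollary to Thm.IX.8.10 lends itself to a brute-force check on a finite example. The sketch below is an illustration under assumptions: the helper names and the three-point neighborhood system are hypothetical, and a successful check on one example is of course no substitute for the proof.

```python
from itertools import combinations

def subsets(space):
    items = list(space)
    return [frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]

def is_open(a, H):
    return all(any(b <= a for b in H[x]) for x in a)

def closure(a, space, H):
    """alpha together with its limit points, computed from the definition."""
    opens = [b for b in subsets(space) if is_open(b, H)]
    limit_points = frozenset(
        x for x in space
        if all(any(y != x and y in a for y in b) for b in opens if x in b)
    )
    return a | limit_points

def meets_every_open_set_about(x, a, space, H):
    """Right side of the corollary: every open set containing x meets alpha."""
    opens = [b for b in subsets(space) if is_open(b, H)]
    return all(bool(a & b) for b in opens if x in b)

def check_closure_characterization(space, H):
    return all(
        (x in closure(a, space, H)) == meets_every_open_set_about(x, a, space, H)
        for a in subsets(space) for x in space
    )

space = frozenset({0, 1, 2})
H = {0: {frozenset({0})}, 1: {frozenset({0, 1})}, 2: {space}}
print(check_closure_characterization(space, H))
```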

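We now prove that every open set is a sum of neighborhoods. That is,
if $\alpha$ is an open set, then there is a $\lambda$, each of whose members is a
neighborhood of some point, such that $\alpha = \bigcup\lambda$.

Theorem IX.8.11. $(\alpha):: \alpha \in \mathrm{OS} .: \supset :. (E\lambda):. \alpha = \bigcup\lambda :. (\beta): \beta \in \lambda .\supset. (Ex).\beta \in H(x)$.

Proof. Take A to be $\hat{\beta}((Ex).\beta \in H(x) : \beta \subseteq \alpha)$. We easily get

(1) $(\beta): \beta \in A .\supset. (Ex).\beta \in H(x)$

(2) $(\beta): \beta \in A .\supset. \beta \subseteq \alpha$.

By (2) and Thm.IX.5.8, Part II,

(3) $\bigcup A \subseteq \alpha$.

Now assume $\alpha \in \mathrm{OS}$, and $x \in \alpha$. Then by rule C, $\beta \in H(x).\beta \subseteq \alpha$. So
$\beta \in A$. As $x \in \beta$ by Axiom 1b, we get $x \in \bigcup A$. Hence

(4) $\alpha \in \mathrm{OS} .\supset. \alpha \subseteq \bigcup A$.

By (1), (3), and (4), we get

$\alpha \in \mathrm{OS} .: \supset :. \alpha = \bigcup A :. (\beta): \beta \in A .\supset. (Ex).\beta \in H(x)$.

Then our theorem follows by Thm.VIII.2.7.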

We  now  prove  some  theorems  using  Axiom  4,  beginning  with  the  theorem 
that  the  derived  set  of  a  set  is  closed. 

Theorem IX.8.12. $(\alpha).\alpha' \in \mathrm{CS}$.

Proof. By Thm.IX.8.7, it suffices to prove $\alpha'' \subseteq \alpha'$, and this we do,
using essentially the same proof given in Sec. 9 of Chapter VI, only now
phrased in terms of neighborhoods instead of distances. So we assume

(1) $x \in \alpha''$,

namely,

$(\beta): x \in \beta . \beta \in \mathrm{OS} .\supset. (Ey).y \neq x . y \in \beta . y \in \alpha'$,

(2) $x \in \beta$,

(3) $\beta \in \mathrm{OS}$.

Then by Axiom scheme 6, we get $(Ey).y \neq x . y \in \beta . y \in \alpha'$, so that by rule C

(4) $y \neq x$,

(5) $y \in \beta$,

(6) $y \in \alpha'$,

namely

$(\gamma): y \in \gamma . \gamma \in \mathrm{OS} .\supset. (Ez).z \neq y . z \in \gamma . z \in \alpha$.

By (3), (5), the definition of OS, and rule C, we get

(7) $\phi \in H(y)$,

(8) $\phi \subseteq \beta$.

By (4), Axiom 4, and two uses of rule C, we get

(9) $\theta \in H(x)$,

(10) $\delta \in H(y)$,

(11) $\theta \cap \delta = \Lambda$.

As $x \in \theta$ by (9) and Axiom 1b, we have by (11),

(12) $\sim x \in \delta$.

By (7), (10), Axiom 2, and rule C,

(13) $\gamma \in H(y)$,

(14) $\gamma \subseteq \phi \cap \delta$.

By (12) and (14),

(15) $\sim x \in \gamma$.

By (13), we get $y \in \gamma$ by Axiom 1b, and $\gamma \in \mathrm{OS}$ by Thm.IX.8.5, so that
by (6) and rule C

(16) $z \neq y$,

(17) $z \in \gamma$,

(18) $z \in \alpha$.

By (15), (17), and Thm.VII.1.4,

(19) $z \neq x$,

and by (17), (14), and (8)

(20) $z \in \beta$.

Thus by (19), (20), and (18),

(21) $(Ez).z \neq x . z \in \beta . z \in \alpha$.

So

(1), (2), (3) $\vdash$ (21).

Hence

(1) $\vdash (\beta): x \in \beta . \beta \in \mathrm{OS} .\supset. (Ez).z \neq x . z \in \beta . z \in \alpha$.

That is,

(1) $\vdash x \in \alpha'$.

Our theorem now follows.

Corollary. $(\alpha).\alpha'' \subseteq \alpha'$.

Note that the proof would still go through if we should replace Axiom 4
by the weaker

4'. $(x,y): x \neq y .\supset. (E\beta).\beta \in H(y).\sim x \in \beta$.
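That some separation axiom such as Axiom 4 or 4' really is needed for Thm.IX.8.12 can be seen on a two-point example in which the whole space is the only neighborhood of each point. This system satisfies Axioms 0a through 3 but neither 4 nor 4'. The Python sketch below is illustrative; the helper names and the example are assumptions, not taken from the text.

```python
from itertools import combinations

def subsets(space):
    items = list(space)
    return [frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]

def is_open(a, H):
    return all(any(b <= a for b in H[x]) for x in a)

def is_closed(a, space, H):
    return is_open(space - a, H)

def derived(a, space, H):
    opens = [b for b in subsets(space) if is_open(b, H)]
    return frozenset(
        x for x in space
        if all(any(y != x and y in a for y in b) for b in opens if x in b)
    )

def satisfies_axiom_4_prime(space, H):
    # 4': for x != y, some neighborhood of y does not contain x.
    return all(any(x not in b for b in H[y])
               for x in space for y in space if x != y)

space = frozenset({0, 1})
H = {0: {space}, 1: {space}}          # only neighborhood: the whole space

alpha = frozenset({0})
d = derived(alpha, space, H)
print(sorted(d), is_closed(d, space, H), satisfies_axiom_4_prime(space, H))
```

Here the derived set of {0} is {1}, which is not closed because its complement {0} contains no neighborhood of 0.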

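Theorem IX.8.13. $(\alpha).(\bar{\alpha})' = \alpha'$.

Proof. $\bar{\alpha} = \alpha \cup \alpha'$. So by Thm.IX.8.9, $(\bar{\alpha})' = \alpha' \cup \alpha''$. So by
Thm.IX.8.12, corollary, $(\bar{\alpha})' = \alpha'$.

Corollary. $(\alpha).\bar{\alpha} \in \mathrm{CS}$.

Proof. Since $(\bar{\alpha})' = \alpha'$, we have $(\bar{\alpha})' \subseteq \bar{\alpha}$, and can use Thm.IX.8.7.

We now show that $\bar{\alpha}$ is the least closed set containing $\alpha$.

Theorem IX.8.14. $(\alpha,\beta): \alpha \subseteq \beta . \beta \in \mathrm{CS} .\supset. \bar{\alpha} \subseteq \beta$.

Proof. Assume $\alpha \subseteq \beta$ and $\beta \in \mathrm{CS}$. Then by Thm.IX.8.8, corollary,
$\bar{\alpha} \subseteq \bar{\beta}$ and by Thm.IX.8.7, Cor. 2, $\bar{\beta} = \beta$. So $\bar{\alpha} \subseteq \beta$.

We prove finally that $\bar{\alpha}$ is the product of all closed sets containing $\alpha$.

Theorem IX.8.15. $(\alpha).\bar{\alpha} = \bigcap(\hat{\beta}(\alpha \subseteq \beta . \beta \in \mathrm{CS}))$.

Proof. By Thm.IX.8.14 and Thm.IX.5.8, Part I,

(1) $\bar{\alpha} \subseteq \bigcap(\hat{\beta}(\alpha \subseteq \beta . \beta \in \mathrm{CS}))$.

Now clearly $\alpha \subseteq \bar{\alpha}$, so that by Thm.IX.8.13, corollary,
$\bar{\alpha} \in \hat{\beta}(\alpha \subseteq \beta . \beta \in \mathrm{CS})$. So by Thm.IX.5.5, Part I,

$\bigcap(\hat{\beta}(\alpha \subseteq \beta . \beta \in \mathrm{CS})) \subseteq \bar{\alpha}$.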

An alternative approach to Hausdorff spaces is not to take as an
undefined idea the idea of sets of neighborhoods $H(x)$ associated with each x
but to take as undefined a general set of neighborhoods N. With this
approach, only "three" axioms are needed, namely:

0b. $\Sigma \neq \Lambda$.

1. $\bigcup N = \Sigma$.

2. $(\alpha,\beta,x): \alpha,\beta \in N . x \in \alpha \cap \beta .\supset. (E\gamma).\gamma \in N . x \in \gamma . \gamma \subseteq \alpha \cap \beta$.

3. $(x,y): x \neq y .\supset. (E\alpha,\beta).\alpha,\beta \in N . x \in \alpha . y \in \beta . \alpha \cap \beta = \Lambda$.

We can now define $H^*(x)$ to be $\hat{\alpha}(\alpha \in N . x \in \alpha)$. Then we easily derive the
original set of axioms with $H^*(x)$ in place of $H(x)$ as follows. Axiom 0a of
the old set follows immediately from the new Axiom 1. From the new
Axiom 1, we can infer, since $(x)\,F(x)$ denotes $(u).u \in \Sigma \supset F(u)$,
$(x)(E\alpha).\alpha \in N . x \in \alpha$, whence the old Axiom 1a comes immediately. The old
Axiom 1b is immediate in view of the definition of $H^*(x)$. Clearly our new
Axiom 2 is expressly designed to give the old Axiom 2. With our definition of
$H^*(x)$, we can derive the old Axiom 3 by taking $\beta$ to be $\alpha$. Finally, the new
Axiom 3 is expressly designed to give the old Axiom 4.

Conversely, given the old axioms with $H(x)$, we could define $N^*$ to be
$\hat{\alpha}(Ex).\alpha \in H(x)$, and prove the new set of axioms with $N^*$ in place of N.
Hence we can prove exactly the same theorems about neighborhoods, open
sets, etc., from either set of axioms.
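The passage between a general set of neighborhoods N and the point-indexed classes $H^*(x)$ can be sketched for finite spaces as follows. The function names `h_star`, `n_star`, and `satisfies_three_axioms` are hypothetical names chosen for this illustration, and the three-point example is an assumption of the sketch.

```python
def h_star(N, space):
    """H*(x): the members of N that contain x."""
    return {x: {a for a in N if x in a} for x in space}

def n_star(H):
    """N*: every set that is a neighborhood of some point."""
    return set().union(*H.values())

def satisfies_three_axioms(space, N):
    if not space:                                   # Axiom 0b
        return False
    if frozenset().union(*N) != space:              # Axiom 1
        return False
    for a in N:                                     # Axiom 2
        for b in N:
            for x in a & b:
                if not any(x in c and c <= a & b for c in N):
                    return False
    for x in space:                                 # Axiom 3
        for y in space:
            if x != y and not any(
                x in a and y in b and not (a & b) for a in N for b in N
            ):
                return False
    return True

space = frozenset({0, 1, 2})
N = {frozenset({x}) for x in space}                 # all unit subsets

H = h_star(N, space)
print(satisfies_three_axioms(space, N))
print(n_star(H) == N)
```

For this N, passing to $H^*(x)$ and back to $N^*$ recovers N itself; in general $N^*$ formed from $H^*$ contains exactly the members of N that contain some point.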

EXERCISES

IX.8.1. With $N^*$ defined as indicated above, prove the new set of axioms
with $N^*$ in place of N.

IX.8.2. With $H^*(x)$ defined as indicated above in terms of N and with
OS defined by $\hat{\alpha}(x): x \in \alpha .\supset. (E\beta).\beta \in H^*(x).\beta \subseteq \alpha$, show that an
alternative form of the new Axiom 2 is

$(\alpha,\beta): \alpha,\beta \in N .\supset. \alpha \cap \beta \in \mathrm{OS}$.

IX.8.3. Prove

(a) $\Lambda,\Sigma \in \mathrm{OS}$.

(b) $\Lambda,\Sigma \in \mathrm{CS}$.

A space is said to be "connected" if $\Lambda$ and $\Sigma$ are the only sets which are
both open and closed.

IX.8.4. Suppose we start with the "four" axioms for $H(x)$ and then
define $H^*(x)$ to be $\hat{\alpha}(x \in \alpha . \alpha \in \mathrm{OS})$. Prove the "four" axioms with $H^*(x)$
in place of $H(x)$, and also prove

$(\alpha):. \alpha \in \mathrm{OS} : \equiv : (x): x \in \alpha .\supset. (E\beta).\beta \in H^*(x).\beta \subseteq \alpha$.

IX.8.5. Two sets of neighborhoods, $N_1$ and $N_2$, are said to be equivalent
if each satisfies the "three" axioms indicated above, and in addition each
of the two following statements is valid:

(I) $(\alpha,x): \alpha \in N_1 . x \in \alpha .\supset. (E\beta).\beta \in N_2 . \beta \subseteq \alpha . x \in \beta$.

(II) $(\beta,x): \beta \in N_2 . x \in \beta .\supset. (E\alpha).\alpha \in N_1 . \alpha \subseteq \beta . x \in \alpha$.

Prove that, if $N_1$ and $N_2$ are equivalent sets of neighborhoods, then $\alpha$
would be classified as an open set in terms of $N_1$ if and only if it would be
classified as an open set in terms of $N_2$.

IX.8.6. Let $N_1 = \mathrm{USC}(\Sigma)$ and $N_2 = \mathrm{SC}(\Sigma)$. Then prove that $N_1$ and
$N_2$ are equivalent sets of neighborhoods in the sense of the preceding
exercise, and that $(x).\{x\} \in \mathrm{OS}$.

IX.8.7. Prove:

(a) $(x).\{x\} \in \mathrm{CS}$.

(b) $(\alpha,\beta): \alpha,\beta \in \mathrm{OS} .\supset. \alpha \cup \beta \in \mathrm{OS}$.

(c) $(\alpha,\beta): \alpha,\beta \in \mathrm{CS} .\supset. \alpha \cap \beta \in \mathrm{CS}$.

(d) $(x).\bigcap H(x) = \{x\}$.

IX.8.8. Let N be a set of neighborhoods satisfying the "three" Hausdorff
axioms, and let $\alpha$ be any nonnull subset of $\Sigma$. Writing $\Sigma^*$ for $\alpha$ and

$N^* = \hat{\beta}(E\gamma).\gamma \in N . \beta = \alpha \cap \gamma$,

prove that $N^*$ satisfies the "three" Hausdorff axioms with respect to $\Sigma^*$.


CHAPTER  X 
RELATIONS  AND  FUNCTIONS 

1.  The  Axiom  of  Infinity.  Eventually  in  mathematical  reasoning  we 
shall  need  the  axiom  of  infinity.  One  could  put  off  its  use  for  quite  a  while, 
but  if  it  is  introduced  now,  the  theory  of  relations  and  functions  will  be 
much  simplified.  Actually,  we  introduce  an  alternative  postulate  which  is 
equivalent  to  the  axiom  of  infinity  but  which  is  more  particularly  suitable 
to  our  immediate  needs  than  the  more  familiar  forms  of  the  axiom  of 
infinity. 

First we introduce the definitions

$A + B$ for $\{\alpha \cup \beta \mid \alpha \in A . \beta \in B . \alpha \cap \beta = \Lambda\}$

Nn for $\hat{x}((\beta):: 0 \in \beta : (y).y \in \beta \supset y + 1 \in \beta .: \supset :. x \in \beta)$

where in the definition of $A + B$, $\alpha$ and $\beta$ are distinct variables which do
not occur at all in A or B.

$A + B$ is stratified if and only if $A = B$ is stratified. Stratification of
$A + B$ requires that A and B have the same type, which will be the type of
$A + B$ also. The occurrences of free variables in $A + B$ are exactly those
in each of A and B. Nn is stratified. As it contains no free variables, it
may be assigned any type. One may even assign different types to two
occurrences of Nn in the same statement.

It  will  turn  out  that  A  -{-  B  is  the  familiar  notion  of  the  numerical  sum 
of  two  nonnegative  integers  A  and  B,  and  Nn  is  the  class  of  nonnegative 
integers  or  nonnegative  whole  numbers. 
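The definition of $A + B$ can be tried out in a small finite model. In the sketch below, which is an illustration under stated assumptions rather than the text's own construction, a four-element set `universe` stands in for the universal class, and the number n is modeled as the class of all n-element subsets of that set; the names `cardinal` and `plus` are hypothetical.

```python
from itertools import combinations

def cardinal(n, universe):
    """The class of all n-element subsets of the (finite) universe."""
    return {frozenset(c) for c in combinations(sorted(universe), n)}

def plus(A, B):
    """A + B: all unions of a member of A with a disjoint member of B."""
    return {a | b for a in A for b in B if not (a & b)}

universe = {0, 1, 2, 3}
zero = cardinal(0, universe)     # the class whose only member is the null class
one = cardinal(1, universe)      # the unit subsets of the universe
two = cardinal(2, universe)
three = cardinal(3, universe)

print(plus(one, two) == three)           # 1 + 2 = 3
print(plus(two, zero) == two)            # m + 0 = m
print(plus(one, two) == plus(two, one))  # m + n = n + m
print(plus(three, two) == set())         # no 5-element subsets exist here
```

The last line shows what goes wrong in a finite model: once m + n exceeds the size of the universe, the sum collapses to the null class, which is the kind of degeneracy an axiom of infinity is meant to rule out.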

Quine (see Quine, 1951) refers to Nn as the set of natural numbers, and
thinks of the two "n's" in "Nn" as standing for "natural number."
However, standard mathematical usage reserves the term "natural numbers"
for the positive integers. Hence we shall have to think of the two "n's" in
"Nn" as standing for "nonnegative."

**Theorem X.1.1. $\vdash (m,n,\alpha): \alpha \in m + n .\equiv. (E\beta,\gamma).\beta \in m . \gamma \in n . \beta \cap \gamma = \Lambda . \beta \cup \gamma = \alpha$.

Proof. Use Thms.IX.3.1 and IX.3.2, Cor. 3.

**Theorem X.1.2. $\vdash (m).0 \neq m + 1$.

Proof. Assume $0 = m + 1$. Then by Thm.X.1.1 and $\vdash \Lambda \in 0$ and rule C,
we get $\beta \in m$, $\gamma \in 1$, $\beta \cap \gamma = \Lambda$, and $\beta \cup \gamma = \Lambda$. From $\gamma \in 1$ by
Thm.IX.6.10, Cor. 5, and rule C, we get $\gamma = \{y\}$. So $y \in \gamma$. So $y \in \beta \cup \gamma$.
So $y \in \Lambda$. This is a contradiction.

Let us refer back to the definition of Clos(A,P). In this, take A to be
$\{0\}$ and P to be $x + 1 = z$. Then $H(A,\beta,P)$ denotes

$\{0\} \subseteq \beta :. (x,z): x \in \beta . x + 1 = z .\supset. z \in \beta$.

We  note  that  this  is  stratified,  so  that  we  have  [-  2$H{A,P,P). 

Theorem X.1.3. $\vdash \mathrm{Clos}(\{0\}, x + 1 = z) = \mathrm{Nn}$.

Proof.  Since  we  have  \-  5^H{{0},^,x  +  1  =  2)  and  h  a(Nn),  it  follows 
from  Thm.IX.5.12  that  we  need  merely  prove 

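$\vdash H(\{0\},\beta,x + 1 = z) .: \equiv :. 0 \in \beta : (y).y \in \beta \supset y + 1 \in \beta$.

However, by Thm.IX.6.5,

$\vdash 0 \in \beta .\equiv. \{0\} \subseteq \beta$.

Also

$\vdash (x,z): x \in \beta . x + 1 = z .\supset. z \in \beta :: \equiv :: (y,z):. y + 1 = z : \supset : y \in \beta .\supset. z \in \beta :: \equiv :: (y): y \in \beta .\supset. y + 1 \in \beta$.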

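**Theorem X.1.4. $\vdash 0 \in \mathrm{Nn}$.

Proof. Use Thm.IX.5.13.

**Theorem X.1.5. $\vdash (n): n \in \mathrm{Nn} .\supset. n + 1 \in \mathrm{Nn}$.

Proof. By Thm.IX.5.14, $\vdash (x,z): x \in \mathrm{Nn} . x + 1 = z .\supset. z \in \mathrm{Nn}$. From
this, we get $\vdash (x): x \in \mathrm{Nn} .\supset. x + 1 \in \mathrm{Nn}$.

Theorem X.1.6. $\vdash (\beta):: 0 \in \beta : (y): y \in \beta . y \in \mathrm{Nn} .\supset. y + 1 \in \beta .: \supset :. \mathrm{Nn} \subseteq \beta$.

Proof. By Thm.IX.5.15,
$\vdash (\beta):: \{0\} \subseteq \beta : (x,z): x \in \beta . x \in \mathrm{Nn} . x + 1 = z .\supset. z \in \beta .: \supset :. \mathrm{Nn} \subseteq \beta$.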
*Theorem X.1.7. $\vdash (n):. n \in \mathrm{Nn} : \equiv : n = 0 .\lor. (Em).m \in \mathrm{Nn} . n = m + 1$.

Proof. By Thm.IX.5.16,
$\vdash (z):. z \in \mathrm{Nn} : \equiv : z \in \{0\} .\lor. (Ex).x \in \mathrm{Nn} . x + 1 = z$.

We  now  have  all  the  theorems  dealing  with  +  and  Nn  which  we  need  for 
the  present  chapter.  However,  we  might  as  well  prove  a  few  related  theo- 
rems before  passing  on  to  other  matters. 

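*Theorem X.1.8. $\vdash (m).m = m + 0$.

Proof. Let $\alpha \in m$. Then $\alpha \in m . \Lambda \in 0 . \alpha \cap \Lambda = \Lambda . \alpha \cup \Lambda = \alpha$. So
$(E\beta,\gamma).\beta \in m . \gamma \in 0 . \beta \cap \gamma = \Lambda . \beta \cup \gamma = \alpha$. So $\alpha \in m + 0$.
Conversely, assume $\alpha \in m + 0$ and use rule C. Then
$\beta \in m . \gamma \in 0 . \beta \cap \gamma = \Lambda . \beta \cup \gamma = \alpha$. So $\gamma = \Lambda$. So $\beta \cup \Lambda = \alpha$.
So $\beta = \alpha$. So $\alpha \in m$.

**Theorem X.1.9. $\vdash (m,n).m + n = n + m$.

Proof. Obvious by Thm.X.1.1.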

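Theorem X.1.10. $\vdash (m,n,p,\alpha) : \alpha \in (m + n) + p . \equiv . (\mathrm{E}\beta,\gamma,\delta) . \beta \in m . \gamma \in n . \delta \in p . \beta \cap \gamma = \Lambda . \beta \cap \delta = \Lambda . \gamma \cap \delta = \Lambda . \beta \cup \gamma \cup \delta = \alpha$.

Proof. Assume $\alpha \in (m + n) + p$. Then by Thm.X.1.1 and rule C,
$\theta \in m + n . \delta \in p . \theta \cap \delta = \Lambda . \theta \cup \delta = \alpha$. So by Thm.X.1.1 and rule C,
$\beta \in m . \gamma \in n . \beta \cap \gamma = \Lambda . \beta \cup \gamma = \theta . \delta \in p . \theta \cap \delta = \Lambda . \theta \cup \delta = \alpha$. Then
$\theta \cap \delta = (\beta \cap \delta) \cup (\gamma \cap \delta)$. Hence $\theta \cap \delta = \Lambda . \equiv . \beta \cap \delta = \Lambda . \gamma \cap \delta = \Lambda$. Hence
$(\mathrm{E}\beta,\gamma,\delta) . \beta \in m . \gamma \in n . \delta \in p . \beta \cap \gamma = \Lambda . \beta \cap \delta = \Lambda . \gamma \cap \delta = \Lambda . \beta \cup \gamma \cup \delta = \alpha$.
Conversely, if we assume
$(\mathrm{E}\beta,\gamma,\delta) . \beta \in m . \gamma \in n . \delta \in p . \beta \cap \gamma = \Lambda . \beta \cap \delta = \Lambda . \gamma \cap \delta = \Lambda . \beta \cup \gamma \cup \delta = \alpha$,
then by rule C,
$\beta \in m . \gamma \in n . \beta \cap \gamma = \Lambda . \beta \cup \gamma = \beta \cup \gamma . \delta \in p . (\beta \cup \gamma) \cap \delta = \Lambda . (\beta \cup \gamma) \cup \delta = \alpha$.
So $(\beta \cup \gamma) \in (m + n) . \delta \in p . (\beta \cup \gamma) \cap \delta = \Lambda . (\beta \cup \gamma) \cup \delta = \alpha$. So $\alpha \in (m + n) + p$.

Theorem X.1.11. $\vdash (m,n,p,\alpha) : \alpha \in m + (n + p) . \equiv . (\mathrm{E}\beta,\gamma,\delta) . \beta \in m . \gamma \in n . \delta \in p . \beta \cap \gamma = \Lambda . \beta \cap \delta = \Lambda . \gamma \cap \delta = \Lambda . \beta \cup \gamma \cup \delta = \alpha$.

Proof. Similar to that of Thm.X.1.10.

**Corollary. $\vdash (m,n,p) . (m + n) + p = m + (n + p)$.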

Henceforth we shall commonly write $m + n + p$ for $(m + n) + p$. By
this corollary, $m + n + p$ can equally well denote $m + (n + p)$.

Theorem X.1.12. $\vdash 1 \in \mathrm{Nn}$.

Proof. By Thms.X.1.4 and X.1.5, $\vdash 0 + 1 \in \mathrm{Nn}$. So by Thm.X.1.9,
$\vdash 1 + 0 \in \mathrm{Nn}$. So by Thm.X.1.8, $\vdash 1 \in \mathrm{Nn}$.

We now prove the principle of mathematical induction that, if $\vdash F(0)$
and $\vdash (n) : F(n) . n \in \mathrm{Nn} . \supset . F(n + 1)$, then $\vdash (n) : n \in \mathrm{Nn} . \supset . F(n)$.

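NOTE THAT THIS IS PROVED ONLY WHEN $F(x)$ IS STRATIFIED.

**Theorem X.1.13. Let $P$ be a stratified statement. Then
{Sub in $P$: 0 for $n$}, $(n) : n \in \mathrm{Nn} . P . \supset .$ {Sub in $P$: $n + 1$ for $n$} $\vdash (n) : n \in \mathrm{Nn} . \supset . P$.

Proof. If $P$ is stratified, then

(1) $\vdash n \in \hat{n}P \equiv P$,

(2) $\vdash 0 \in \hat{n}P \equiv$ {Sub in $P$: 0 for $n$},

(3) $\vdash n + 1 \in \hat{n}P \equiv$ {Sub in $P$: $n + 1$ for $n$}.

However, by Thm.X.1.6, $0 \in \hat{n}P$, $(n) : n \in \mathrm{Nn} . n \in \hat{n}P . \supset . n + 1 \in \hat{n}P \vdash \mathrm{Nn} \subseteq \hat{n}P$.
From this our theorem follows by (1), (2), and (3).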

Note  that  the  presence  or  absence  of  additional  free  variables  besides  n 
in  P  is  quite  immaterial  in  the  preceding  theorem.  Actually,  P  need  not 
even  contain  free  occurrences  of  n,  but  the  theorem  is  quite  trivial  in  this 
case. 

In referring to Thm.X.1.13, we shall usually write $F(n)$ for $P$, $F(0)$ for
{Sub in $P$: 0 for $n$}, and $F(n + 1)$ for {Sub in $P$: $n + 1$ for $n$}. Then
Thm.X.1.13 says that, if $F(n)$ is stratified, then

$$F(0),\ (n) : n \in \mathrm{Nn} . F(n) . \supset . F(n + 1) \vdash (n) : n \in \mathrm{Nn} . \supset . F(n).$$

One should carefully distinguish Thm.X.1.13 from the intuitive principle
of mathematical induction which we have used on various occasions. In
general form they are very similar. In each case one wishes to prove a
statement about nonnegative integers. However, in the intuitive case, our
statement was about the formal logic, and the integers involved are
intuitive numbers, whereas in the result above the statement is within the formal
logic, and the only thing that makes it even seem to be a statement about
numbers is that the conclusion of our theorem has the additional hypothesis
$n \in \mathrm{Nn}$ prefixed to the statement $F(n)$. However, if we consider $n \in \mathrm{Nn}$ to
be the analogue within our formal logic of the statement "$n$ is a nonnegative
integer," then certainly Thm.X.1.13 is the analogue within our formal
logic of the familiar principle of proof by mathematical induction.

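Theorem X.1.14. $\vdash (m,n) : m,n \in \mathrm{Nn} . \supset . m + n \in \mathrm{Nn}$.

Proof. In Thm.X.1.13, take $F(n)$ to be

$$(m) : m \in \mathrm{Nn} . \supset . m + n \in \mathrm{Nn}.$$

Then $F(0)$ is

$$(m) : m \in \mathrm{Nn} . \supset . m + 0 \in \mathrm{Nn},$$

and so we get

(1) $\vdash F(0)$

by Thm.X.1.8. By Thm.X.1.11, corollary, $\vdash (m + n) + 1 = m + (n + 1)$.
Also by Thm.X.1.5, $\vdash m + n \in \mathrm{Nn} . \supset . (m + n) + 1 \in \mathrm{Nn}$. So
$\vdash m + n \in \mathrm{Nn} . \supset . m + (n + 1) \in \mathrm{Nn}$. So

$$\vdash (m) : m \in \mathrm{Nn} . \supset . m + n \in \mathrm{Nn} .: \supset :. (m) : m \in \mathrm{Nn} . \supset . m + (n + 1) \in \mathrm{Nn}.$$

That is

(2) $\vdash F(n) \supset F(n + 1)$.

Then by (1) and (2) and Thm.X.1.13,

$$\vdash (n) : n \in \mathrm{Nn} . \supset . F(n).$$

This readily gives our theorem.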

Just in passing we make the natural definitions:

    2    for    1 + 1,
    3    for    2 + 1,
    4    for    3 + 1,
    5    for    4 + 1,
    etc.

Theorem X.1.15. $\vdash 2 + 2 = 4$.

Proof. By definition $\vdash 3 + 1 = 4$. However, 3 is $2 + 1$, so that we
have $\vdash (2 + 1) + 1 = 4$. Then by Thm.X.1.11, corollary, $\vdash 2 + (1 + 1) = 4$.
By the definition of 2, this is $\vdash 2 + 2 = 4$.

In a similar manner, one could prove $\vdash 2 + 3 = 5$, $\vdash 2 + 4 = 6$,
$\vdash 3 + 3 = 6$, etc.
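
The computation behind Thm.X.1.15 can be imitated in a small finite model. The sketch below is only an illustration under stated assumptions: the four-element `UNIVERSE` is a hypothetical stand-in for V (which is not finite in the text), Python `frozenset` values stand in for classes, and the helper names `subsets_of_size` and `add` are ours. Addition follows the characterization of Thm.X.1.1: a class belongs to m + n when it is the union of a member of m and a disjoint member of n.

```python
from itertools import combinations

# Hypothetical finite stand-in for the universal class V (an assumption of this sketch).
UNIVERSE = frozenset({"a", "b", "c", "d"})


def subsets_of_size(k):
    """The class of all k-member subclasses of UNIVERSE."""
    return frozenset(frozenset(c) for c in combinations(sorted(UNIVERSE), k))


def add(m, n):
    """m + n: every union of a member of m with a disjoint member of n."""
    return frozenset(b | g for b in m for g in n if not (b & g))


ZERO = frozenset({frozenset()})  # 0: the class whose only member is the null class
ONE = subsets_of_size(1)         # 1: the class of all unit classes
TWO = add(ONE, ONE)
THREE = add(TWO, ONE)
FOUR = add(THREE, ONE)

print(add(TWO, TWO) == FOUR)      # finite analogue of 2 + 2 = 4
print(TWO == subsets_of_size(2))  # 2 consists of the two-member classes
print(add(THREE, ZERO) == THREE)  # finite analogue of m = m + 0
```

All three comparisons succeed in this model. The model says nothing about stratification, and it breaks down as soon as a sum exceeds the size of `UNIVERSE`.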



We note that all members of 0, to wit $\Lambda$, contain zero members, and all
classes which contain zero members, to wit $\Lambda$, are members of 0. Likewise
all members of 1 contain one member (see Thm.IX.6.10, Cor. 5), and all
classes which contain one member are unit classes and hence are members
of 1. It will turn out that all members of 2 contain two members, and all
classes which contain two members are members of 2. A similar state of
affairs is true for 3, 4, 5, etc.

*Theorem X.1.16. $\vdash (m,\alpha) : \alpha \in m + 1 . \equiv . (\mathrm{E}\beta,x) . \beta \in m . \sim x \in \beta . \beta \cup \{x\} = \alpha$.

Proof. Use Thm.X.1.1, Thm.IX.6.10, Cor. 5, and Thm.IX.6.6, Cor. 1.

*Corollary 1. $\vdash (\alpha) : \alpha \in 2 . \equiv . (\mathrm{E}x,y) . x \neq y . \{x,y\} = \alpha$.

Corollary 2. $\vdash (\alpha) : \alpha \in 3 . \equiv . (\mathrm{E}x,y,z) . x \neq y . x \neq z . y \neq z . \{x,y,z\} = \alpha$.

etc.

Cor. 1 says that 2 consists of all classes with two members, Cor. 2 says
that 3 consists of all classes with three members, etc.

We  now  adjoin  an  axiom  which  is  equivalent  to  the  axiom  of  infinity. 

Axiom  scheme  13.  The  following  statement,  and  each  statement  got 
from  it  by  prefixing  some  set  of  universal  quantifiers,  is  an  axiom: 

$$(m,n) : m,n \in \mathrm{Nn} . m + 1 = n + 1 . \supset . m = n.$$
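
To see why a statement of this kind cannot be had for nothing, it may help to watch it fail in a finite model. The following sketch is an illustration only, not part of the formal development: the two-element `UNIVERSE` is a hypothetical stand-in for V, `frozenset` values stand in for classes, and `closure` is our finite analogue of Clos, computed by iteration rather than as an intersection of classes.

```python
# Hypothetical two-element stand-in for V (an assumption of this sketch).
UNIVERSE = frozenset({"a", "b"})

ZERO = frozenset({frozenset()})                     # 0: only the null class
ONE = frozenset(frozenset({x}) for x in UNIVERSE)   # 1: all unit classes


def add(m, n):
    """m + n: every union of a member of m with a disjoint member of n."""
    return frozenset(b | g for b in m for g in n if not (b & g))


def closure(seed, step):
    """Least collection containing seed and closed under step."""
    result = set(seed)
    frontier = list(seed)
    while frontier:
        nxt = step(frontier.pop())
        if nxt not in result:
            result.add(nxt)
            frontier.append(nxt)
    return result


NN = closure({ZERO}, lambda n: add(n, ONE))  # finite analogue of Clos({0}, x + 1 = z)

TWO = add(ONE, ONE)
THREE = add(TWO, ONE)   # no class has three members here, so THREE is empty

print(len(NN))                              # 0, 1, 2, and the empty "number"
print(add(TWO, ONE) == add(THREE, ONE))     # successors agree ...
print(TWO == THREE)                         # ... although the numbers differ
```

In this model 2 + 1 = 3 + 1 while 2 and 3 are distinct, so the displayed statement is false there. This is the sense in which the axiom rules out a finite universe.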

The  five  Peano  axioms  for  the  nonnegative  integers  (see  Peano,  1891) 
are  respectively  expressed  by: 

Thm.X.1.4. 

Thm.X.1.5. 

Thm.X.1.2. 

Axiom  scheme  13. 

Thm.X.1.13. 

Actually,  Peano  stated  his  axioms  for  the  positive  integers,  rather  than  for 
the  nonnegative  integers.  However,  his  axioms  are  just  what  the  above  five 
statements  become  if  we  interpret  Nn  as  the  class  of  positive  integers,  and 
replace  0  by  1  throughout.  Hence  it  seems  appropriate  to  refer  to  the  five 
results  cited  as  the  five  Peano  axioms  for  the  nonnegative  integers. 

EXERCISES 

X.1.1. Prove $\vdash (m,n) : m + n = 0 . \equiv . m = 0 . n = 0$.

X.1.2. Prove $\vdash (m,n,p) : m,n,p \in \mathrm{Nn} . m + p = n + p . \supset . m = n$.

X.1.3. Prove $\vdash (m) : m \in \mathrm{Nn} . \supset . m \neq m + 1$.

X.1.4. Define:

$m \leq n$ as $(\mathrm{E}p) . p \in \mathrm{Nn} . n = m + p$.

$m < n$ as $m \leq n . m \neq n$.

$m \geq n$ as $n \leq m$.

$m > n$ as $n < m$.
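Prove:

(a) $\vdash (m) . m \leq m$.

(b) $\vdash (m,n) : m \leq n . \equiv . m < n . \lor . m = n$.

(c) $\vdash (m) : m \in \mathrm{Nn} . \equiv . 0 \leq m$.

(d) $\vdash (m) : m \in \mathrm{Nn} . \equiv . m = 0 . \lor . m \geq 1$.

(e) $\vdash (m,n) : m,n \in \mathrm{Nn} . m \leq n . n \leq m . \supset . m = n$.

(f) $\vdash (m,n) : m,n \in \mathrm{Nn} . m < n . \supset . \sim(n < m)$.

(g) $\vdash (m,n) : m < n . \supset . (\mathrm{E}p) . p \in \mathrm{Nn} . n = m + p + 1$.

(h) $\vdash (m,n) :. m,n \in \mathrm{Nn} : \supset : m < n . \equiv . (\mathrm{E}p) . p \in \mathrm{Nn} . n = m + p + 1$.

(i) $\vdash (m,n,p) : m \leq n . n \leq p . \supset . m \leq p$.

(j) $\vdash (m,n,p) :. m,n,p \in \mathrm{Nn} : \supset : m < n . n \leq p . \supset . m < p$.

(k) $\vdash (m,n,p) :. m,n,p \in \mathrm{Nn} : \supset : m \leq n . n < p . \supset . m < p$.

(l) $\vdash (m,n,p) : m \leq n . \supset . m + p \leq n + p$.

(m) $\vdash (m,n,p) :. m,n,p \in \mathrm{Nn} : \supset : m < n . \supset . m + p < n + p$.

(n) $\vdash (m_1,m_2,n_1,n_2) : m_1 \leq n_1 . m_2 \leq n_2 : \supset : m_1 + m_2 \leq n_1 + n_2$.

(o) $\vdash (m,n,p) :. m,n,p \in \mathrm{Nn} : \supset : m + p \leq n + p . \supset . m \leq n$.

(p) $\vdash (m,n,p) :. m,n,p \in \mathrm{Nn} : \supset : m + p < n + p . \supset . m < n$.

(q) $\vdash (m,n) :. m,n \in \mathrm{Nn} : \supset : m < n . \equiv . m + 1 \leq n$.

(r) $\vdash (m,n) :. m,n \in \mathrm{Nn} : \supset : m < n + 1 . \equiv . m \leq n$.

(s) $\vdash (m,n) : m,n \in \mathrm{Nn} . \supset . m < n . \lor . m = n . \lor . m > n$.

(t) $\vdash (m,n) :. m,n \in \mathrm{Nn} : \supset : m \leq n . \equiv . \sim(n < m)$.

(u) $\vdash (m,n) :. m,n \in \mathrm{Nn} : \supset : m < n . \equiv . \sim(n \leq m)$.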

X.1.5. Prove that $A + B$ is stratified if and only if $A = B$ is stratified,
and that if it is stratified then $A$, $B$, and $A + B$ all have the same type.

X.1.6. Prove that Nn is stratified.

X.1.7. Prove:

(a) $\vdash 2 \in \mathrm{Nn}$.

(b) $\vdash 3 \in \mathrm{Nn}$.

etc.

2. Ordered Pairs and Triples. We shall now introduce the notion of
the ordered pair $\langle x,y \rangle$ of two objects $x$ and $y$. The notion is widely used in
mathematics. In two-dimensional analytic geometry, points are designated
by ordered pairs of the coordinates. Thus we speak of the point (2,3), or
(-7,5), or $(x,y)$.

It would be a bit hard to say just what the ordered pair $\langle x,y \rangle$ of the two
objects $x$ and $y$ consists of from an intuitive point of view. All that is
really necessary is that it be uniquely determined by $x$ and $y$, and that
conversely it shall uniquely determine $x$ and $y$ and specify their order. Any
object which does this can serve for us as the ordered pair $\langle x,y \rangle$.

We shall exhibit such an object and use it as the ordered pair $\langle x,y \rangle$. The
definition which we give will seem extremely artificial and will certainly
provoke the reaction that this is definitely not what one thinks of as the
ordered pair $\langle x,y \rangle$. It is not our claim that what we shall use as the ordered
pair $\langle x,y \rangle$ is what one would think of intuitively as an ordered pair. We
merely claim that it does the things that an ordered pair should do. Hence
we can and will do with it all the things that one could do with a more
congenial kind of ordered pair. Moreover, we know no way to construct a less
artificial ordered pair, and until someone shows how to construct a less
artificial one, we shall use ours to do all the things that an ordered pair is
expected to do.

We now introduce six definitions, of which the first three are temporary
and apply only throughout the present section.

$\phi(A)$ for $\hat{y}((\mathrm{E}x) : x \in A : x \in \mathrm{Nn} . y = x + 1 . \lor . \sim x \in \mathrm{Nn} . y = x)$.

$\theta_1(A)$ for $\{\phi(x) \mid x \in A\}$.

$\theta_2(A)$ for $\{\{0\} \cup \phi(x) \mid x \in A\}$.

$\langle A,B \rangle$ for $\theta_1(A) \cup \theta_2(B)$.

$Q_1(A)$ for $\hat{x}(\phi(x) \in A)$.

$Q_2(A)$ for $\hat{x}(\{0\} \cup \phi(x) \in A)$.

In all these, $x$ and $y$ are distinct variables which do not occur at all in $A$.

Clearly $\phi(A)$, $\theta_1(A)$, $\theta_2(A)$, $Q_1(A)$, and $Q_2(A)$ are stratified if and only if
$A$ is stratified, and if stratified are of the same type as $A$. $\langle A,B \rangle$ is stratified
if and only if $A = B$ is stratified, and if it is stratified it must have the same
type as $A$ and $B$. The free occurrences of variables in $\phi(A)$, $\theta_1(A)$, $\theta_2(A)$,
$\langle A,B \rangle$, $Q_1(A)$, and $Q_2(A)$ are just those in $A$ and $B$.

Of the three permanent definitions, $\langle A,B \rangle$ is the ordered pair of $A$ and $B$,
and $Q_1(A)$ and $Q_2(A)$ are inverses of the ordered pair in the sense that

$$Q_1(\langle A,B \rangle) = A$$

and

$$Q_2(\langle A,B \rangle) = B.$$

To form $\phi(A)$, we replace each nonnegative integer member of $A$ by the
next larger integer and leave all other members of $A$ unchanged. Thus
$\phi(A)$ does not contain 0. Hence $\{0\} \cup \phi(A)$ is distinct from $\phi(B)$ for each
$A$ and $B$. To form $\theta_1(A)$ we take $\phi(x)$ for all $x$'s in $A$, and to form $\theta_2(B)$ we
take $\{0\} \cup \phi(x)$ for all $x$'s in $B$. Clearly $\theta_1(A)$ and $\theta_2(B)$ have no members
in common for each $A$ and $B$. By combining the members of $\theta_1(A)$ and
$\theta_2(B)$, we get $\langle A,B \rangle$. Those members of $\langle A,B \rangle$ which do not contain a 0
come from $A$, and those members of $\langle A,B \rangle$ which do contain a 0 come from
$B$. Hence, given $\langle A,B \rangle$, we can reconstruct the $A$ and $B$ from which it is
formed, and this is what $Q_1$ and $Q_2$ do.
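
The construction just described can be tried out on small finite classes. The sketch below is a toy model under stated assumptions, not the formal definition: Python `int` values stand in for the members of Nn (in the text the numbers are themselves classes), strings stand in for objects that are not numbers, `frozenset` stands in for classes, and the names `phi_inverse`, `pair`, `q1`, `q2` are ours.

```python
def phi(x):
    """Replace each number in x by its successor; leave other members unchanged."""
    return frozenset(e + 1 if isinstance(e, int) else e for e in x)


def phi_inverse(z):
    """Undo phi on a class that contains no 0."""
    return frozenset(e - 1 if isinstance(e, int) else e for e in z)


def pair(a, b):
    """The ordered pair of a and b: theta_1(a) together with theta_2(b)."""
    theta1 = {phi(x) for x in a}
    theta2 = {frozenset({0}) | phi(x) for x in b}
    return frozenset(theta1 | theta2)


def q1(p):
    """All x with phi(x) in p; these come from the members of p lacking 0."""
    return frozenset(phi_inverse(z) for z in p if 0 not in z)


def q2(p):
    """All x with {0} united with phi(x) in p; these come from the members containing 0."""
    return frozenset(phi_inverse(z - {0}) for z in p if 0 in z)


A = frozenset({frozenset({0, 1, "u"}), frozenset()})
B = frozenset({frozenset({0}), frozenset({"v", 5})})
P = pair(A, B)

print(q1(P) == A, q2(P) == B)   # both components are recovered
print(pair(A, B) == pair(B, A)) # the order matters
```

The first line prints `True True` and the second `False`. Note that `pair(A, B)` is again a class of classes of the same kind as `A` and `B`, which mirrors the remark that the pair has the same type as its components.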
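Theorem X.2.1.

I. $\vdash (y,z) :. y \in \phi(z) : \equiv : (\mathrm{E}x) : x \in z : x \in \mathrm{Nn} . y = x + 1 . \lor . \sim x \in \mathrm{Nn} . y = x$.

II. $\vdash (\alpha,x) : x \in \theta_1(\alpha) . \equiv . (\mathrm{E}y) . y \in \alpha . x = \phi(y)$.

III. $\vdash (\alpha,x) : x \in \theta_2(\alpha) . \equiv . (\mathrm{E}y) . y \in \alpha . x = \{0\} \cup \phi(y)$.

IV. $\vdash (x,y,z) : z \in \langle x,y \rangle . \equiv . z \in \theta_1(x) . \lor . z \in \theta_2(y)$.

V. $\vdash (\alpha,x) : x \in Q_1(\alpha) . \equiv . \phi(x) \in \alpha$.

VI. $\vdash (\alpha,x) : x \in Q_2(\alpha) . \equiv . \{0\} \cup \phi(x) \in \alpha$.

Proof. Use Thm.IX.3.1 and Thm.IX.3.2, Cor. 1 and Cor. 3.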

Theorem X.2.2. $\vdash (x,y) : \phi(x) = \phi(y) . \equiv . x = y$.

Proof. Clearly it suffices to prove $\vdash \phi(x) = \phi(y) . \supset . x = y$. Assume
$\phi(x) = \phi(y)$ and $z \in x$.

Case 1. $z \in \mathrm{Nn}$. Then $z + 1 \in \phi(x)$. So $z + 1 \in \phi(y)$. By rule C, $w \in y$
and $w \in \mathrm{Nn} . z + 1 = w + 1 . \lor . \sim w \in \mathrm{Nn} . z + 1 = w$. As we have by $z \in \mathrm{Nn}$
and Thm.X.1.5 that $z + 1 \in \mathrm{Nn}$, we have

$$\sim(\sim w \in \mathrm{Nn} . z + 1 = w).$$

So $w \in \mathrm{Nn} . z + 1 = w + 1$. As $z \in \mathrm{Nn}$, we have by Axiom scheme 13, $z = w$.
Hence $z \in y$.

Case 2. $\sim z \in \mathrm{Nn}$. Then $z \in \phi(x)$. So $z \in \phi(y)$. By rule C, $w \in y$ and
$w \in \mathrm{Nn} . z = w + 1 . \lor . \sim w \in \mathrm{Nn} . z = w$. By Thm.X.1.5 and $\sim z \in \mathrm{Nn}$,

$$\sim(w \in \mathrm{Nn} . z = w + 1).$$

So $\sim w \in \mathrm{Nn} . z = w$. Hence $z \in y$.

Analogously, from $\phi(x) = \phi(y)$ and $z \in y$, we get $z \in x$.

Corollary. $\vdash (\alpha,x) : \phi(x) \in \theta_1(\alpha) . \equiv . x \in \alpha$.

Proof. Use Thm.IX.3.7.

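Theorem X.2.3. $\vdash (x) . \sim 0 \in \phi(x)$.

Proof. Use Thm.X.2.1, Part I, together with Thm.X.1.4 and Thm.X.1.2.

Theorem X.2.4. $\vdash (x,y) : \{0\} \cup \phi(x) = \{0\} \cup \phi(y) . \equiv . x = y$.

Proof. Assume $\{0\} \cup \phi(x) = \{0\} \cup \phi(y)$. Let $z \in \phi(x)$. Then $z \neq 0$ and
$z \in \{0\} \cup \phi(x)$. So $z \in \{0\} \cup \phi(y)$. As $z \neq 0$, $z \in \phi(y)$. Conversely, if
$z \in \phi(y)$, then $z \in \phi(x)$. So $\phi(x) = \phi(y)$. So $x = y$.

Corollary. $\vdash (\alpha,x) : \{0\} \cup \phi(x) \in \theta_2(\alpha) . \equiv . x \in \alpha$.

Theorem X.2.5. $\vdash (\alpha,\beta) : \theta_1(\alpha) = \theta_1(\beta) . \equiv . \alpha = \beta$.

Proof. Assume $\theta_1(\alpha) = \theta_1(\beta)$ and $x \in \alpha$. Then $\phi(x) \in \theta_1(\alpha)$. So
$\phi(x) \in \theta_1(\beta)$. So $x \in \beta$.

Theorem X.2.6. $\vdash (\alpha,\beta) : \theta_2(\alpha) = \theta_2(\beta) . \equiv . \alpha = \beta$.

Proof. Similar to that of Thm.X.2.5.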

Theorem X.2.7. $\vdash (x,y) . Q_1(\langle x,y \rangle) = x$.

Proof. Let $z \in Q_1(\langle x,y \rangle)$. Then $\phi(z) \in \langle x,y \rangle$. So $\phi(z) \in \theta_1(x) . \lor . \phi(z) \in \theta_2(y)$.

Case 1. $\phi(z) \in \theta_1(x)$. Then $z \in x$.

Case 2. $\phi(z) \in \theta_2(y)$. Then by rule C, $w \in y$ and $\phi(z) = \{0\} \cup \phi(w)$.
So $0 \in \phi(z)$. By Thm.X.2.3, this is a contradiction. However, by truth
values $\vdash P {\sim} P \supset Q$. So $\vdash 0 \in \phi(z) . \sim 0 \in \phi(z) . \supset . z \in x$. So $z \in x$.

Thus we conclude

$$\vdash Q_1(\langle x,y \rangle) \subseteq x.$$

Conversely, let $z \in x$. Then $\phi(z) \in \theta_1(x)$. So $\phi(z) \in \langle x,y \rangle$. So $z \in Q_1(\langle x,y \rangle)$.

Theorem X.2.8. $\vdash (x,y) . Q_2(\langle x,y \rangle) = y$.

Proof. Similar to that of Thm.X.2.7.

**Theorem X.2.9. $\vdash (x,y,u,v) : \langle x,y \rangle = \langle u,v \rangle . \equiv . x = u . y = v$.

Proof. Assume $\langle x,y \rangle = \langle u,v \rangle$. Then $Q_1(\langle x,y \rangle) = Q_1(\langle u,v \rangle)$. So $x = u$.
Similarly $y = v$.

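As an ordered triple, $\langle A,B,C \rangle$, we can take $\langle \langle A,B \rangle, C \rangle$. Then we have:

Theorem X.2.10.

I. $\vdash (x,y,z) . Q_1(\langle x,y,z \rangle) = \langle x,y \rangle$.

II. $\vdash (x,y,z) . Q_1(Q_1(\langle x,y,z \rangle)) = x$.

III. $\vdash (x,y,z) . Q_2(Q_1(\langle x,y,z \rangle)) = y$.

IV. $\vdash (x,y,z) . Q_2(\langle x,y,z \rangle) = z$.

Theorem X.2.11. $\vdash (x,y,z,u,v,w) : \langle x,y,z \rangle = \langle u,v,w \rangle . \equiv . x = u . y = v . z = w$.

We can define quadruples, quintuples, etc., as

$\langle A,B,C,D \rangle$ for $\langle \langle A,B,C \rangle, D \rangle$

$\langle A,B,C,D,E \rangle$ for $\langle \langle A,B,C,D \rangle, E \rangle$

etc.,

and have similar theorems.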
We define

$A \times B$ for $\{\langle x,y \rangle \mid x \in A . y \in B\}$

where $x$ and $y$ are distinct variables not occurring at all in $A$ or $B$.

$A \times B$ is stratified if and only if $A = B$ is stratified, and if stratified is of
the same type as $A$ and $B$. The free occurrences of variables in $A \times B$ are
just those in $A$ and $B$.

$A \times B$ is the class of all ordered pairs such that the first member is in $A$
and the second member is in $B$. The class of all ordered pairs is then $V \times V$.

$A \times B$ is called the direct product of $A$ and $B$ or the Cartesian product
of $A$ and $B$.

*Theorem X.2.12. $\vdash (\alpha,\beta,z) : z \in \alpha \times \beta . \equiv . (\mathrm{E}x,y) . x \in \alpha . y \in \beta . z = \langle x,y \rangle$.

Corollary 1. $\vdash (z) : z \in V \times V . \equiv . (\mathrm{E}x,y) . z = \langle x,y \rangle$.

Corollary 2. $\vdash V \times V \neq \Lambda$.

Corollary 3. $\vdash (\alpha) . \Lambda \times \alpha = \alpha \times \Lambda = \Lambda$.

*Theorem X.2.13. $\vdash (\alpha,\beta,x,y) : \langle x,y \rangle \in \alpha \times \beta . \equiv . x \in \alpha . y \in \beta$.

Proof. Use Thm.IX.3.8.
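
For finite classes the content of Thm.X.2.12, Cor. 3, and Thm.X.2.13 can be checked directly. In the sketch below Python tuples are used as a stand-in for the ordered pair (an assumption of the sketch; the pair of the text is a class, not a tuple), and the function name `cartesian` is ours.

```python
from itertools import product


def cartesian(a, b):
    """All pairs whose first member is in a and whose second member is in b."""
    return frozenset(product(a, b))


alpha = frozenset({1, 2})
beta = frozenset({"p", "q", "r"})
empty = frozenset()

# A pair is in the product exactly when its members are in the factors.
candidates = [(x, y) for x in (1, 2, 3) for y in ("p", "q", "r", "s")]
agrees = all(((x, y) in cartesian(alpha, beta)) == (x in alpha and y in beta)
             for x, y in candidates)
print(agrees)

# A product with the null class is null on either side.
print(cartesian(empty, alpha) == cartesian(alpha, empty) == empty)
```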

We shall use $x$, $y$, and $z$ with the understanding that they are restricted
to the range $V \times V$. It will be recalled that, in dealing with variables of
restricted range, it is necessary that the range not be $\Lambda$. This is stated for
the range $V \times V$ by Thm.X.2.12, Cor. 2, above.

Since $V \times V$ is the class of all ordered pairs (see Thm.X.2.12, Cor. 1),
$x$, $y$, and $z$ serve as variables which denote ordered pairs. When they occur
free, our conventions about the use of variables of restricted range would
require explicit mention of the conditions $x \in V \times V$, $y \in V \times V$, and
$z \in V \times V$ which are implicit when $x$, $y$, and $z$ occur bound. However, we
shall often omit them in those cases where it is clear from the context how
they should be supplied.

For classes of ordered pairs, we shall use $R$, $S$, and $T$. Accordingly, they
are variables restricted to the range $\mathrm{SC}(V \times V)$. We shall take similar
liberties with free occurrences of $R$, $S$, and $T$.

The sort of ordered pair which we are using was invented by Quine (see
Quine, 1945). The use of the letter $Q$ in $Q_1(A)$ and $Q_2(A)$ is to signalize
Quine's discovery of this means of dealing with ordered pairs.

EXERCISES 

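X.2.1. Show in detail that the definition of $\langle A,B \rangle$ does satisfy the stated
stratification requirements.

X.2.2. Prove $\vdash (\alpha,\beta,x,y) : \langle x,y \rangle \in \alpha \times \beta . \equiv . \langle y,x \rangle \in \beta \times \alpha$.

X.2.3. Prove $\vdash (\alpha,\beta,\gamma) . (\alpha \cap \beta) \times \gamma = (\alpha \times \gamma) \cap (\beta \times \gamma)$.

X.2.4. Prove $\vdash (\alpha,\beta,\gamma) : \alpha \subseteq \beta . \supset . \alpha \times \gamma \subseteq \beta \times \gamma$.

X.2.5. Prove $\vdash (x,y) . \{x\} \times \{y\} = \{\langle x,y \rangle\}$.

X.2.6. Prove:

(a) $\vdash (\alpha,\beta) . (\alpha \times 0) \cap (\beta \times \{V\}) = \Lambda$.

(b) $\vdash (\alpha,x) : \sim x \in \alpha . \equiv . \sim \langle x,\Lambda \rangle \in \alpha \times 0$.

X.2.7. Prove:

(a) $\vdash (\alpha,n) : n \in \mathrm{Nn} . \alpha \in n . \supset . \alpha \times 0 \in n$.

(b) $\vdash (\alpha,n) : n \in \mathrm{Nn} . \alpha \times 0 \in n . \supset . \alpha \in n$.
(Hint. Use Thm.X.1.13.)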

3. The Calculus of Relations. In mathematics, a function is a class of
ordered pairs $\langle x,y \rangle$, the $x$'s being the arguments and the $y$'s the values
corresponding to the $x$'s. Thus the function "square of" consists of all ordered
pairs $\langle x,x^2 \rangle$. This property of functions is the basis of analytic geometry, in
which we define the graph of a function to consist of all points $(x,y)$ such
that $\langle x,y \rangle$ is one of the ordered pairs which comprise the function.

It is not an uncommon practice to reject "many-valued functions" and
not to consider them as legitimate functions. We shall take this stand.
Whenever we shall use the word "function" we shall mean "single-valued
function" except for some explanatory remarks in the next few paragraphs.
That is, we consider a function as a class of ordered pairs $\langle x,y \rangle$ with the
special property that, if there is some $x$ and $y$ such that $\langle x,y \rangle$ is one of the
ordered pairs comprising the function, then there is no $z$ different from $y$
such that $\langle x,z \rangle$ is also one of the ordered pairs comprising the function.

Although we reject "many-valued functions" as functions, nevertheless
we must consider the type of object which is referred to as a "many-valued
function." It is certainly true that a "many-valued function" is a class of
ordered pairs. In analytic geometry, one graphs "many-valued functions"
quite as legitimately as one graphs "single-valued functions." The graphs
of

$$y^2 = x^2,$$

$$x^2 + y^2 = 25,$$

$$y = \arcsin x,$$

are perfectly well defined graphs, even though they are not graphs of
functions. One can even "graph" still more general things, such as

$$x < y.$$

The "graph" would consist of all points above the line $x = y$. Thus we
can think of $x < y$ as defining a kind of generalized "many-valued function"
in which for any $x$ the values of $f(x)$ are all $y$'s greater than $x$. In this case,
our "function" is still a class of ordered pairs, namely, the class of all
ordered pairs $\langle x,y \rangle$ such that $x < y$, and its "graph" is drawn accordingly.
One usually thinks of "$x < y$" as expressing a relation between $x$ and $y$
rather than defining a "many-valued function" of $x$. The exact distinction
between $y$ being related to $x$ (as when $x < y$) and $y$ being a "many-valued
function" of $x$ (as when $y^2 = x^2$ or $x^2 + y^2 = 25$ or $y = \arcsin x$) is not very
clear. Originally the distinction seems to have been that the graph of a
"function" (whether single-valued or many-valued) should consist of
various curves and isolated points, whereas the "graph" of a "relation" would
comprise all points in some region. Thus

$$x^2 + y^2 = 25$$

expresses a function because its graph is the circumference of a circle,
which is a perfectly good curve. On the other hand,

$$x^2 + y^2 < 25$$

expresses a relation, because its "graph" consists of all points in the interior
of a circle.

This distinction may have been useful historically, but nowadays it is of
little value; in fact the existence of space-filling curves makes it ambiguous.
We abandon it, and distinguish merely between relations and functions
(by which we mean "single-valued functions").

A relation is defined to be any class of ordered pairs $\langle x,y \rangle$. A function is
a relation such that there is exactly one $y$ for each $x$.
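
The difference between a relation and a function can be made concrete on a finite grid of integer points. The sketch below is illustrative only: Python tuples stand in for ordered pairs, the grid `points` and the helper `is_function` are our own assumptions, and the test applied is the single-valuedness condition stated earlier (no $x$ is paired with two different $y$'s).

```python
def is_function(relation):
    """True when no argument x is paired with two different values y."""
    values = {}
    for x, y in relation:
        if values.setdefault(x, y) != y:
            return False
    return True


points = range(-5, 6)  # hypothetical finite grid of integers

square = {(x, x * x) for x in points}                                    # "square of"
circle = {(x, y) for x in points for y in points if x * x + y * y == 25}
less_than = {(x, y) for x in points for y in points if x < y}

print(is_function(square))     # True
print(is_function(circle))     # False: (3, 4) and (3, -4) both occur
print(is_function(less_than))  # False: -5 is below many values
```

All three are relations in the sense just defined, since each is a class of ordered pairs; only the first is a function.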

We  do  not  make  the  claim  that  the  notion  of  "many-valued  function" 
can  never  be  of  value.  In  the  theory  of  functions  of  a  complex  variable, 
the  condition  of  analyticity  prevents  the  existence  of  space-filling  curves 
and  other  such  monstrosities  and  permits  one  to  define  the  different 
branches  of  a  function  in  an  unambiguous  sense.  Hence  in  this  theory  one 
can  talk  intelligibly  of  "many-valued  functions"  as  distinct  from  relations, 
and  it  is  indeed  very  useful  to  do  so.  However,  throughout  the  present 
text  it  is  futile  to  try  to  preserve  the  distinction  between  relations  and 
"many-valued  functions,"  and  we  do  not  try. 

We  generalize  the  notions  of  relation  and  function  to  allow  relations 
between  arbitrary  objects  and  functions  of  arbitrary  objects  instead  of 
merely  relations  between  numbers  and  functions  of  numbers. 

We recall that $R$, $S$, and $T$ are restricted to the range $\mathrm{SC}(V \times V)$.
That is, $R$, $S$, and $T$ always denote subclasses of $V \times V$. However $V \times V$ is
the class of all ordered pairs. Thus $R$, $S$, and $T$ always denote classes of
ordered pairs, i.e., relations. That is, $\mathrm{SC}(V \times V)$ is the class of all
relations. For this reason, the abbreviation Rel is often used to denote
$\mathrm{SC}(V \times V)$.

We say that $x$ and $y$ stand in the relation $R$ if $\langle x,y \rangle \in R$. This is commonly
abbreviated to $xRy$. This is analogous to the mathematical notations
$x = y$ and $x < y$ where the signs for the relation, namely, "=" and "<"
are placed between the objects which stand in the given relation to each
other. One often reads "$xRy$" as "$x$ has the relation $R$ to $y$."

Though we shall be interested in the notation $xRy$ only in those cases
where $R$ is a relation, the notation $xRy$ is permitted for any $R$ whatever
and will always mean $\langle x,y \rangle \in R$. When necessary to prevent ambiguity, we
shall enclose $x$, $R$, $y$, or $xRy$ in parentheses.

In considering the stratification of $xRy$ it is usually the case that $x$ and $y$
are variables which do not occur in the term $R$. In such case $xRy$ is
stratified if and only if $R$ is, and $x$ and $y$ must have the same type, which must be
one less than the type of $R$.

We recollect that a stratified statement $F(x)$ which contains the free
variable $x$ determines a class $\hat{x}F(x)$, namely, the class such that $x$ is in this
class exactly whenever $x$ makes $F(x)$ true. Thus, for stratified $F(x)$, we have

$$\vdash (x) . x \in \hat{x}F(x) \equiv F(x).$$

Similarly, we expect a stratified statement $F(x,y)$ with the free variables
$x$ and $y$ to determine a relation between $x$ and $y$, namely, the relation such
that $x$ and $y$ stand in that relation exactly whenever $x$ and $y$ make $F(x,y)$
true. By analogy with the notation for the class determined by $F(x)$, we
denote the relation determined by $F(x,y)$ by $\hat{x}\hat{y}F(x,y)$. Then we expect to
have

$$\vdash (x,y) . x(\hat{x}\hat{y}F(x,y))y \equiv F(x,y).$$

We must say more here about the stratification requirements. It will
not suffice merely for $F(x,y)$ to be stratified in order for it to determine a
relation $\hat{x}\hat{y}F(x,y)$. Not only will $F(x,y)$ have to be stratified, it will have
to be stratified in such a way that $x$ and $y$ have the same type. This is
consistent with the requirement that for $xRy$ to be stratified $x$ and $y$ must
have the same type.

As a matter of fact, it is a good thing that in general $F(x,y)$ will not
determine a relation $\hat{x}\hat{y}F(x,y)$ unless $F(x,y)$ is stratified with $x$ and $y$ having
the same type. If $x = \{y\}$ should determine a relation

$$\hat{x}\hat{y}(x = \{y\}),$$

then we could derive the Russell, Cantor, and Burali-Forti paradoxes,
which would be most unpleasant.

In those cases in which $F(x,y)$ is stratified but $x$ and $y$ do not have the
same type, one can easily construct a substitute for the relation $\hat{x}\hat{y}F(x,y)$.
For instance, suppose $F(x,y)$ is stratified, but the type of $x$ is two higher than
the type of $y$. Choose a $z$ not occurring in $F(x,y)$. Then
$(\mathrm{E}y) . F(x,y) . z = \{\{y\}\}$ is stratified with $x$ and $z$ of the same type. Hence it determines
a relation

$$\hat{x}\hat{z}((\mathrm{E}y) . F(x,y) . z = \{\{y\}\}).$$

Then if we put

$$R = \hat{x}\hat{z}((\mathrm{E}y) . F(x,y) . z = \{\{y\}\}),$$

we shall have

$$\vdash (x,y) . xR\{\{y\}\} \equiv F(x,y).$$

For most purposes, this $R$ will serve as a relation determined by $F(x,y)$.
Among the places where it would not so serve are in the derivations of the
Russell, Cantor, and Burali-Forti paradoxes.

The case of any other difference in the types of $x$ and $y$ can be handled
similarly.
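
The substitute relation can be imitated on finite data. The sketch below rests on several assumptions of our own: Python is untyped, so "type" is only mimicked by nesting depth of `frozenset` values; the statement `statement(x, y)`, read as "y belongs to some member of x", is a hypothetical choice of $F(x,y)$ in which $x$ sits two levels above $y$; tuples stand in for ordered pairs; and `unit` is our name for the unit class.

```python
def unit(y):
    """The unit class of y."""
    return frozenset({y})


def statement(x, y):
    """Hypothetical F(x, y): y belongs to some member of x."""
    return any(y in member for member in x)


atoms = ["a", "b", "c"]
classes = [
    frozenset({frozenset({"a"}), frozenset({"a", "b"})}),
    frozenset({frozenset({"c"})}),
    frozenset(),
]

# R relates x to the doubly wrapped y, which sits at the same level as x.
R = frozenset((x, unit(unit(y)))
              for x in classes for y in atoms if statement(x, y))

# x R {{y}} holds exactly when F(x, y) holds.
print(all(((x, unit(unit(y))) in R) == statement(x, y)
          for x in classes for y in atoms))
```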

We define $\hat{x}\hat{y}P$ as

$$\{\langle x,y \rangle \mid P . x = x . y = y\}.$$

Clearly, if $P$ is stratified with $x$ and $y$ having the same type, then

$$\langle x,y \rangle = \langle x,y \rangle . P . x = x . y = y$$

is stratified, and so by Thm.IX.3.1, corollary, $\hat{x}\hat{y}P$ exists. Also, it will
have type one higher than the common type of $x$ and $y$. Thus the types
are right so that

$$x(\hat{x}\hat{y}P)y \equiv P$$

will be stratified.
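
Over a finite domain the relation determined by a statement can simply be listed. The sketch below is an illustration under our own assumptions: tuples stand in for ordered pairs, `domain` is a hypothetical finite range for both variables, `divides` is an arbitrary sample statement, and `relation_of` is our name for the finite analogue of $\hat{x}\hat{y}P$.

```python
def relation_of(statement, domain):
    """All pairs (x, y) from the domain for which the statement holds."""
    return frozenset((x, y) for x in domain for y in domain if statement(x, y))


def divides(x, y):
    """Sample statement P: x is a nonzero divisor of y."""
    return x != 0 and y % x == 0


domain = range(6)
R = relation_of(divides, domain)

# x R y holds exactly when the defining statement holds of x and y.
print(all(((x, y) in R) == divides(x, y) for x in domain for y in domain))
```

Stratification has no counterpart in this sketch; it only shows the intended reading of the notation.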

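Theorem X.3.1. $\vdash (R)(z) :. z \in R : \equiv : (\mathrm{E}x,y) . xRy . z = \langle x,y \rangle$.

Proof. Assume

(1) $R \in \mathrm{Rel}$.

Let $z \in R$. Then by (1) and the definition of Rel, $z \in V \times V$. So by rule C
and Thm.X.2.12, Cor. 1, $z = \langle x,y \rangle$. Hence $\langle x,y \rangle \in R$. That is, $xRy$. Hence

$$(\mathrm{E}x,y) . xRy . z = \langle x,y \rangle.$$

Conversely, assume this and use rule C. So $z = \langle x,y \rangle$ and $xRy$. That is,
$\langle x,y \rangle \in R$. Hence $z \in R$.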

Theorem X.3.2. If $z$ is a variable which does not occur in $P$, then
$\vdash (R) :. (z) : z \in R . \equiv . (\mathrm{E}x,y) . P . z = \langle x,y \rangle : \equiv : (x,y) . xRy \equiv P$.

Proof. Let us write $F(x,y)$ for $P$, $F(u,v)$ for {Sub in $P$: $u$ for $x$, $v$ for $y$},
etc. Assume

(1) $R \in \mathrm{Rel}$.

Assume

(2) $(z) : z \in R . \equiv . (\mathrm{E}x,y) . F(x,y) . z = \langle x,y \rangle$.

Choose $u$ and $v$ distinct from any variables in the theorem. Let $uRv$. That
is, $\langle u,v \rangle \in R$. So by (2) and rule C, $F(x,y) . \langle u,v \rangle = \langle x,y \rangle$. Then $u = x$ and
$v = y$. Hence $F(u,v)$. Conversely, assume $F(u,v)$. Then
$(\mathrm{E}x,y) . F(x,y) . \langle u,v \rangle = \langle x,y \rangle$. So by (2), $\langle u,v \rangle \in R$, which is $uRv$. Thus we have deduced
$(u,v) . uRv \equiv F(u,v)$. By a change of bound variables, $(x,y) . xRy \equiv F(x,y)$.
Conversely assume this. Then by Thm.VI.5.4,

$$(\mathrm{E}x,y) . xRy . z = \langle x,y \rangle . \equiv . (\mathrm{E}x,y) . F(x,y) . z = \langle x,y \rangle.$$

Now use Thm.X.3.1.

**Theorem X.3.3. $\vdash (R,S) : R = S . \equiv . (x,y) . xRy \equiv xSy$.

Proof. Take $P$ to be $xSy$ in Thm.X.3.2 and use Thm.X.3.1.

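Theorem X.3.4. If $R$ is a variable which does not occur in $P$, then
$\vdash (\mathrm{E}R)(x,y) . xRy \equiv P : \supset : \exists \hat{x}\hat{y}P$.

Proof. Assume $(\mathrm{E}R)(x,y) . xRy \equiv P$. Then by Thm.X.3.2,

$$(\mathrm{E}R)(z) : z \in R . \equiv . (\mathrm{E}x,y) . P . z = \langle x,y \rangle.$$

So

$$(\mathrm{E}\alpha)(z) : z \in \alpha . \equiv . (\mathrm{E}x,y) . P . z = \langle x,y \rangle.$$

Replacing $P$ by $P . x = x . y = y$, we get $\exists \hat{x}\hat{y}P$.

**Theorem X.3.5. If $P$ is stratified with all free occurrences of $x$ and $y$ of
the same type, then $\vdash \exists \hat{x}\hat{y}P$.

Proof. Use Thm.IX.3.1, corollary.

Theorem X.3.6. If $z$ does not occur in $P$, then
$\vdash \exists \hat{x}\hat{y}P .: \supset :. (z) : z \in \hat{x}\hat{y}P . \equiv . (\mathrm{E}x,y) . P . z = \langle x,y \rangle$.

Proof. Use Thm.IX.3.2, Cor. 3, and the fact that

$$\vdash P : \equiv : P . x = x . y = y.$$

**Corollary. $\vdash \exists \hat{x}\hat{y}P . \supset . \hat{x}\hat{y}P \in \mathrm{Rel}$.

**Theorem X.3.7. $\vdash \exists \hat{x}\hat{y}P : \supset : (x,y) . x(\hat{x}\hat{y}P)y \equiv P$.

Proof. Use Thm.X.2.9 and Thm.IX.3.8.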

Corollary 1. If R is a variable which does not occur in P, then
⊢ ∃x̂ŷP: ≡ :(ER)(x,y).xRy ≡ P.

Proof. To go from left to right, take R to be x̂ŷP. To go from right to
left, use Thm.X.3.4.

Corollary 2. ⊢ (R):(x,y).xRy ≡ P. ⊃ .R = x̂ŷP.

Proof. From (x,y).xRy ≡ P, we can get (x,y).xRy ≡ x(x̂ŷP)y by the
theorem, and then we get R = x̂ŷP by Thm.X.3.3.

Corollary 3. ⊢ (R).R = x̂ŷ(xRy).

Theorem X.3.8. ⊢ (x,y).P ≡ Q: ⊃ :x̂ŷP = x̂ŷQ.

Proof.    Use  Thm.IX.3.3,  Cor.  1. 

Theorem X.3.9. If F(u,v) is the result of replacing all free occurrences
of x and y, respectively, in F(x,y) by occurrences of u and v, and F(x,y) is
the result of replacing all free occurrences of u and v, respectively, in F(u,v)
by occurrences of x and y, then ⊢ x̂ŷF(x,y) = ûv̂F(u,v).

Proof.     Use  Thm.IX.3.5. 

Theorem  X.3.10.  If  z  is  a  variable  which  does  not  occur  in  F{x,y),  then 
YMyF{x,y).  D  .xyF{x,y)  =  If {Q,{z) ,Q,{z)) . 

Proof. By Thm.IX.7.6, it suffices to prove ⊢ (Ex,y).F(x,y).z = ⟨x,y⟩.
≡ .z ∈ V × V.F(Q₁(z),Q₂(z)). Assume the left side and use rule C. Then
F(x,y).z = ⟨x,y⟩. So Q₁(z) = x and Q₂(z) = y. So F(Q₁(z),Q₂(z)). Also,
since z = ⟨x,y⟩, z ∈ V × V. Conversely, assume the right side and use rule
C on z ∈ V × V. Then z = ⟨x,y⟩ and F(Q₁(z),Q₂(z)). Then Q₁(z) = x and
Q₂(z) = y. Hence F(x,y).

Corollary. ⊢ (α,β).α × β = ẑ(Q₁(z) ∈ α.Q₂(z) ∈ β).

The observation that α × β is x̂ŷ(x ∈ α.y ∈ β) enables us to prove the
existence of unstratified relations. If one takes α and β to be of different
types, as, for instance, if we take α to be γ̂(β ∈ γ), then α × β will be
unstratified but will nevertheless exist by Thm.X.2.12. It does not appear
that one can get into any difficulties by this device.

Since relations are classes, R ∩ S and R ∪ S have a perfectly explicit
meaning. So does −R, except that we ask the reader to recall that, because
of the restriction on the range of R, −R denotes (V × V) − R, the "V × V"
usually being suppressed for typographical convenience. In the case of
−(x̂ŷP), no such convention exists, and we make none. Thus −(x̂ŷP)
denotes all objects not in x̂ŷP, and to denote the class of all ordered pairs
not in x̂ŷP we should have to write (V × V) − x̂ŷP.

Because of our convention, −R always denotes the class of all ordered
pairs not in R. That is, −R is the complementary relation to R. To
denote the complementary class to R, that is, the class of all objects not in
R, we would have to resort to some such expression as x̂(~ x ∈ R).
However, the need for this almost never arises.

We note a few elementary applications of ∪, ∩, and − as applied to
relations. If R is "father of" and S is "mother of," then R ∪ S is "parent
of." Also if R and S are < and =, respectively, then R ∪ S is ≤. The
notation of ∩ is less commonly used for relations, but as an example, if
R and S are ≤ and ≥, then R ∩ S is =. This is the basis of many proofs
of equality, where x = y is proved by proving both x ≤ y and x ≥ y.
Clearly if R is =, then −R is ≠. Also, if we are restricting attention to
the range of real numbers and R is <, then −R is ≥.
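
These examples can be checked informally on a finite domain. The sketch below is ours and is not part of the formal development: it models a relation as a Python set of pairs, and it assumes a small domain D (the integers 0 through 4) standing in for the real numbers. The names D, pairs, and complement are chosen only for illustration.

```python
from itertools import product

# Assumption: a small finite domain stands in for the real numbers.
D = range(5)
pairs = set(product(D, D))          # plays the role of V x V

lt = {(x, y) for x, y in pairs if x < y}
eq = {(x, y) for x, y in pairs if x == y}
le = {(x, y) for x, y in pairs if x <= y}
ge = {(x, y) for x, y in pairs if x >= y}

def complement(r):
    """-R in the relational sense: all ordered pairs over D not in r."""
    return pairs - r

print((lt | eq) == le)         # union of < and = is <=
print((le & ge) == eq)         # intersection of <= and >= is =
print(complement(lt) == ge)    # complement of < is >=
```

Note that complement subtracts from the set of pairs, not from the class of all objects, in line with the convention for −R stated above.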

Such terms as R ∪ S, R ∩ S, and −R are not extensively used but are
convenient on occasion. Moreover, such theorems as Thm.IX.4.4,
Thm.IX.4.5, etc., hold for relations as well as classes if we put V × V for V
throughout. Indeed, the fact that they hold for relations is merely a special
case of the fact that they hold generally for variables over a restricted
range. By writing xRy for ⟨x,y⟩ ∈ R, some of them take novel forms. For
example, Thms.IX.4.10 through IX.4.12 take such forms as

⊢ (x,y).x(V × V)y.

⊢ (x,y).~(xΛy).

⊢ (x,y) P: ⊃ :(x,y).x(V × V)y ≡ P.

⊢ (x,y) ~P: ⊃ :(x,y).xΛy ≡ P.

⊢ (x,y) P. ⊃ .∃x̂ŷP.

⊢ (x,y) ~P. ⊃ .∃x̂ŷP.

⊢ (x,y) P. ⊃ .V × V = x̂ŷP.

⊢ (x,y) ~P. ⊃ .Λ = x̂ŷP.

⊢ (R):.(x,y).xRy: ≡ :R = V × V.

⊢ (R):.(x,y).~(xRy): ≡ :R = Λ.

⊢ (R):.(Ex,y).~(xRy): ≡ :R ≠ V × V.

⊢ (R):.(Ex,y).xRy: ≡ :R ≠ Λ.

The analogues of all theorems of Sec. 4 of Chapter IX are easily proved
for relations, provided that one puts V × V for V. They are then available
for the proofs of theorems peculiar to the theory of relations, such as the two
following theorems.

Theorem X.3.11.

I. ⊢ (α,β,γ).(α ∩ β) × γ = (α × γ) ∩ (β × γ).
II. ⊢ (α,β,γ).γ × (α ∩ β) = (γ × α) ∩ (γ × β).
III. ⊢ (α,β,γ).(α ∪ β) × γ = (α × γ) ∪ (β × γ).
IV. ⊢ (α,β,γ).γ × (α ∪ β) = (γ × α) ∪ (γ × β).

Proof of Part I.

⊢ x((α ∩ β) × γ)y:
≡ :x ∈ (α ∩ β).y ∈ γ:
≡ :x ∈ α.x ∈ β.y ∈ γ:
≡ :x ∈ α.y ∈ γ.x ∈ β.y ∈ γ:
≡ :x(α × γ)y.x(β × γ)y:
≡ :x((α × γ) ∩ (β × γ))y.

Then ⊢ (α ∩ β) × γ = (α × γ) ∩ (β × γ) follows by Thm.X.3.3.

Proof of Parts II, III, and IV. Similar.

Theorem X.3.12.
I. ⊢ (α,β,γ):α ⊆ β. ⊃ .α × γ ⊆ β × γ.
II. ⊢ (α,β,γ):α ⊆ β. ⊃ .γ × α ⊆ γ × β.

Proof of Part I. Let α ⊆ β. Then α = α ∩ β. So α × γ = (α ∩ β) × γ =
(α × γ) ∩ (β × γ) by Thm.X.3.11. So α × γ ⊆ β × γ.

Proof of Part II. Similar.

Corollary. ⊢ (α,β,γ,δ):α ⊆ β.γ ⊆ δ. ⊃ .α × γ ⊆ β × δ.
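
A finite sanity check of Thm.X.3.11 and of the corollary to Thm.X.3.12 can be sketched as follows. This is an illustration on particular finite sets, not a proof; the helper cross and the sample sets alpha, beta, gamma, delta are our own assumptions.

```python
from itertools import product

def cross(a, b):
    """The Cartesian product a x b as a set of ordered pairs."""
    return set(product(a, b))

alpha, beta = {1, 2, 3}, {2, 3, 4}
gamma, delta = {"p", "q"}, {"p", "q", "r"}

# Thm.X.3.11, Parts I and III, on the sample sets.
part_i = cross(alpha & beta, gamma) == cross(alpha, gamma) & cross(beta, gamma)
part_iii = cross(alpha | beta, gamma) == cross(alpha, gamma) | cross(beta, gamma)

# Corollary to Thm.X.3.12: small <= beta and gamma <= delta
# give small x gamma <= beta x delta.
small = alpha & beta
corollary = cross(small, gamma) <= cross(beta, delta)

print(part_i, part_iii, corollary)
```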

The following useful theorem for relations is an analogue of the definition
of ⊆ for classes.

*Theorem X.3.13. ⊢ (R,S):.(x,y).xRy ⊃ xSy: ≡ :R ⊆ S.

Proof. The implication from right to left goes easily. Conversely,
assume (x,y).xRy ⊃ xSy. Then (x,y):xRy.z = ⟨x,y⟩. ⊃ .xSy.z = ⟨x,y⟩. So
by Thm.VI.7.6, (Ex,y).xRy.z = ⟨x,y⟩. ⊃ .(Ex,y).xSy.z = ⟨x,y⟩. Then by
Thm.X.3.1, R ⊆ S.

Generalizations of the theorems of Secs. 5 and 6 of Chapter IX are of
little use in the theory of relations. However, a generalized form of the
set of unit subclasses is quite useful. The members of USC(R) are the
unit classes of the ordered pairs ⟨x,y⟩ which are members of R. Hence
USC(R) would not in general be a relation, and even if it were a relation,
it would have little connection, as a relation, with R. To generate a relation
corresponding to R, but one type higher, we take the class of ordered pairs
of the unit classes, {x} and {y}, of the constituents, x and y, of the ordered
pairs ⟨x,y⟩ which are members of R. This will be denoted by RUSC(R).
So we define

RUSC(A)         for         {⟨{x},{y}⟩ | xAy}

where x and y are distinct variables which do not occur in A. We also define

RUSC²(A)        for        RUSC(RUSC(A)),
RUSC³(A)        for        RUSC(RUSC²(A)),
etc.

RUSC(A) is stratified if and only if A is, and if it is stratified, is one type
higher than A. The free occurrences of variables in RUSC(A) are just
those in A.
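
The passage from R to RUSC(R) can be illustrated on finite relations. The following is a minimal sketch under our own assumptions: relations are Python sets of pairs, the unit class {x} is modelled by frozenset([x]), and the names unit and rusc are ours. Type distinctions, which are the point of RUSC in the formal system, are not modelled at all.

```python
def unit(x):
    """The unit class {x}, modelled as a one-element frozenset."""
    return frozenset([x])

def rusc(r):
    """RUSC(r): the pairs <{x},{y}> for every pair <x,y> in r."""
    return {(unit(x), unit(y)) for x, y in r}

R = {(1, 2), (2, 3)}
S = {(2, 3), (3, 4)}

print((unit(1), unit(2)) in rusc(R))                  # {1} RUSC(R) {2}
print(rusc(R & S) == rusc(R) & rusc(S))               # commutes with intersection
print((unit(unit(1)), unit(unit(2))) in rusc(rusc(R)))  # RUSC applied twice
```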

Theorem X.3.14. ⊢ (R).RUSC(R) ∈ Rel.

Proof. Simple.

*Theorem X.3.15. ⊢ (R,x,y):x(RUSC(R))y. ≡ .(Eu,v).uRv.x = {u}.
y = {v}.

Proof. Use Thms.IX.3.1 and IX.3.2, Cor. 3.

Corollary. ⊢ (R).RUSC(R) = x̂ŷ(Eu,v).uRv.x = {u}.y = {v}.

*Theorem X.3.16. ⊢ (R,x,y):{x}(RUSC(R)){y}. ≡ .xRy.

Proof. Use Thm.IX.3.8.

Corollary. ⊢ (R,x,y):{{x}}(RUSC²(R)){{y}}. ≡ .xRy.

Theorem X.3.17. ⊢ (R,S):RUSC(R) = RUSC(S). ≡ .R = S.

Proof. Assume RUSC(R) = RUSC(S). Let xRy. Then {x}(RUSC(R)){y}.
So {x}(RUSC(S)){y}. So xSy. Hence by Thm.X.3.13, R ⊆ S. Similarly
S ⊆ R.

Theorem X.3.18.

I. ⊢ (R,S).RUSC(R ∩ S) = RUSC(R) ∩ RUSC(S).
II. ⊢ (R,S).RUSC(R ∪ S) = RUSC(R) ∪ RUSC(S).
III. ⊢ (R,S).RUSC(R − S) = RUSC(R) − RUSC(S).
IV. ⊢ RUSC(Λ) = Λ.
V. ⊢ (R,S):R ⊆ S. ≡ .RUSC(R) ⊆ RUSC(S).
VI. ⊢ (R,S):R ⊂ S. ≡ .RUSC(R) ⊂ RUSC(S).

Proof of Part I. Assume x(RUSC(R ∩ S))y. Then by rule C, u(R ∩ S)v.
x = {u}.y = {v}. So ⟨u,v⟩ ∈ R ∩ S. So ⟨u,v⟩ ∈ R and ⟨u,v⟩ ∈ S. Hence uRv
and uSv. So x(RUSC(R))y and x(RUSC(S))y. That is, ⟨x,y⟩ ∈ RUSC(R)
and ⟨x,y⟩ ∈ RUSC(S). Hence ⟨x,y⟩ ∈ RUSC(R) ∩ RUSC(S). Thus
x(RUSC(R) ∩ RUSC(S))y. Then by Thm.X.3.13,

(1) ⊢ RUSC(R ∩ S) ⊆ RUSC(R) ∩ RUSC(S).

Conversely, let x(RUSC(R) ∩ RUSC(S))y. Then x(RUSC(R))y and
x(RUSC(S))y. By rule C,

uRv.x = {u}.y = {v},
u'Sv'.x = {u'}.y = {v'}.

Hence u = u' and v = v'. So uSv. So u(R ∩ S)v. So x(RUSC(R ∩ S))y.
Hence

(2) ⊢ RUSC(R) ∩ RUSC(S) ⊆ RUSC(R ∩ S).

Proof of Part II.

⊢ x(RUSC(R ∪ S))y
≡ :(Eu,v).u(R ∪ S)v.x = {u}.y = {v}
≡ :(Eu,v):uRv.x = {u}.y = {v}.∨.uSv.x = {u}.y = {v}
≡ :(Eu,v).uRv.x = {u}.y = {v}.∨.(Eu,v).uSv.x = {u}.y = {v}
≡ :x(RUSC(R))y.∨.x(RUSC(S))y
≡ :x(RUSC(R) ∪ RUSC(S))y.

So by Thm.X.3.3, we infer Part II.

Proof of Part IV. It suffices to prove ⊢ (x,y)~(x(RUSC(Λ))y). That is,
⊢ (x,y,u,v).~(uΛv.x = {u}.y = {v}). This is easily proved.

The proofs of Parts III, V, and VI are similar to the proofs of Parts III,
V, and VI of Thm.IX.6.12.

Theorem X.3.19. ⊢ (α,β).RUSC(α × β) = USC(α) × USC(β).

Proof. By Thm.X.2.13,

⊢ x(RUSC(α × β))y
≡ :(Eu,v).u(α × β)v.x = {u}.y = {v}
≡ :(Eu,v).u ∈ α.v ∈ β.x = {u}.y = {v}
≡ :(Eu).u ∈ α.x = {u}:(Ev).v ∈ β.y = {v}
≡ :x ∈ USC(α).y ∈ USC(β)
≡ :x(USC(α) × USC(β))y.

Corollary 1. ⊢ RUSC(V × V) = 1 × 1.

Corollary 2. ⊢ (x,y).{x}(1 × 1){y}.

Proof. Put R = V × V in Thm.X.3.16.

Theorem X.3.20.

I. ⊢ (R,x,y):x(x̂ŷ({x}R{y}))y. ≡ .{x}R{y}.
II. ⊢ (R,S):R ⊆ RUSC(S). ⊃ .x̂ŷ({x}R{y}) ⊆ S.R = RUSC(x̂ŷ({x}R{y})).

Proof. Similar to the proof of Thm.IX.6.14.

Corollary 1. ⊢ (R):R ⊆ 1 × 1. ⊃ .R = RUSC(x̂ŷ({x}R{y})).
Corollary  2.    [-  (i2,R):72  e  RUSC(R).  =  .(ES).S  C  R.i2  =  RUSC(S). 

EXERCISES 

X.3.1. Prove ⊢ (R,x,y):x(x̂ŷ(xRy))y. ≡ .xRy.
X.3.2. Prove:

(a) ⊢ (R,S):.(x,y).xRy ≡ xSy: ≡ :x̂ŷ(xRy) = x̂ŷ(xSy).

(b) ⊢ (R,S):.(x,y).{x}(RUSC(R)){y} ≡ {x}(RUSC(S)){y}: ≡ :
RUSC(R) = RUSC(S).

(c) ⊢ (R).RUSC(R) = RUSC(x̂ŷ(xRy)).

(d) ⊢ (R,x,y):x(RUSC²(R))y. ≡ .(Eu,v).uRv.x = {{u}}.y = {{v}}.

X.3.3. Prove:

(a) ⊢ (R).R ∩ (V × V) = x̂ŷ(xRy).

(b) ⊢ (R,S):.(x,y).xRy ≡ xSy: ≡ :R ∩ (V × V) = S ∩ (V × V).

(c) ⊢ (R).R ∩ (V × V) ∈ Rel.

(d)  ^(R,R).R  nReRel. 

(e)  h  (R,S);R,S  e  Rel.  D  .i?  U  *S  e  Rel. 

4. Special Properties of Relations. If R is a function and xRy, then
x is called the argument and y the corresponding value. It is customary to
write y = f(x) in such a case. We generalize these terms, and in general if
xRy then we say that x is an argument and y is a corresponding value. The
set of all x's such that xRy for some y is the set of arguments of R, and is
denoted by Arg(R). Likewise, the set of all y's such that xRy for some x
is the set of all values of R, and is denoted by Val(R). So we put

Arg(A)        for        x̂(Ey).xAy,
Val(A)        for        ŷ(Ex).xAy,
AV(A)         for        Arg(A) ∪ Val(A),

where x and y are distinct variables not occurring in A. Arg(A) and Val(A)
are stratified if and only if A is, and if stratified have the same type as A.
Hence the same applies to AV(A). Free occurrences of variables are exactly
those in A.

Some logicians refer to Arg(A) and Val(A) as the domain and converse
domain of A. We see no reason for deviating from the standard
mathematical terms "argument" and "value." The sum AV(A) of the arguments
and values of A is called the field of A.

When A is a function, mathematicians often refer to Arg(A) as the range
of A.

If one is dealing with real variables, then if R is the relation determined
by x² + y² = 25, Arg(R) = Val(R) = AV(R) = x̂(−5 ≤ x ≤ 5); if R is the
relation determined by y = sin x, Val(R) = ŷ(−1 ≤ y ≤ 1); if R is the
relation determined by y² = x, Arg(R) = x̂(0 ≤ x).
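
An integer analogue of these examples can be computed directly. The sketch below is ours: it assumes a finite grid D of integers in place of the real numbers, so the resulting sets are the integer points of the intervals above rather than the intervals themselves. The function names arg, val, and av mirror Arg, Val, and AV but are otherwise our own choice.

```python
# Assumption: the integers -6..6 stand in for the real numbers.
D = range(-6, 7)

def arg(r):
    """Arg(r): every x with x r y for some y."""
    return {x for x, _ in r}

def val(r):
    """Val(r): every y with x r y for some x."""
    return {y for _, y in r}

def av(r):
    """AV(r): the field of r, Arg(r) together with Val(r)."""
    return arg(r) | val(r)

circle = {(x, y) for x in D for y in D if x * x + y * y == 25}
parabola = {(x, y) for x in D for y in D if y * y == x}

print(sorted(arg(circle)))     # integer arguments of x^2 + y^2 = 25
print(sorted(arg(parabola)))   # no negative arguments for y^2 = x
print(sorted(val(parabola)))
```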

Theorem X.4.1.
**I. ⊢ (R,x):x ∈ Arg(R). ≡ .(Ey).xRy.
**II. ⊢ (R,y):y ∈ Val(R). ≡ .(Ex).xRy.
*III. ⊢ (R,x):x ∈ AV(R). ≡ .(Ey).xRy ∨ yRx.

Theorem X.4.2.

I. ⊢ (R,S):R ⊆ S. ⊃ .Arg(R) ⊆ Arg(S).
II. ⊢ (R,S):R ⊆ S. ⊃ .Val(R) ⊆ Val(S).
III. ⊢ (R,S):R ⊆ S. ⊃ .AV(R) ⊆ AV(S).

Proof of Part I. Assume R ⊆ S and let x ∈ Arg(R). Then by rule C,
xRy. That is, ⟨x,y⟩ ∈ R. So ⟨x,y⟩ ∈ S. That is, xSy. So x ∈ Arg(S).

Proof of Part II. Similar.

Proof of Part III. Use Thm.IX.4.16, Cor. 2.

Theorem X.4.3.
*I. ⊢ (R):Arg(R) = Λ. ≡ .R = Λ.
II. ⊢ (R):Val(R) = Λ. ≡ .R = Λ.
III. ⊢ (R):AV(R) = Λ. ≡ .R = Λ.

Theorem X.4.4.

I. ⊢ (α,β):β ≠ Λ. ⊃ .Arg(α × β) = α.
II. ⊢ (α,β):α ≠ Λ. ⊃ .Val(α × β) = β.

Corollary.

I. ⊢ (α,β,γ):γ ≠ Λ.α × γ = β × γ. ⊃ .α = β.
II. ⊢ (α,β,γ):γ ≠ Λ.γ × α = γ × β. ⊃ .α = β.
III. ⊢ (α,β,γ):γ ≠ Λ.α × γ ⊆ β × γ. ⊃ .α ⊆ β.
IV. ⊢ (α,β,γ):γ ≠ Λ.γ × α ⊆ γ × β. ⊃ .α ⊆ β.

To prove Parts III and IV, use Thm.X.4.2.

We now define relations with restricted arguments and restricted values.
We agree that α↿R shall denote R with its arguments restricted to lie in α,
R↾β shall denote R with its values restricted to lie in β, and α↿R↾β shall
denote R with its arguments restricted to α and its values restricted to β.
We define

A↿C          for        (A × V) ∩ C,
C↾B          for        (V × B) ∩ C,
A↿C↾B        for        (A × B) ∩ C.

A↿C is stratified if and only if A = C is stratified. If A↿C is stratified,
it has the same type as A and C. The free occurrences of variables in A↿C
are just those in A and C. Similarly for C↾B and A↿C↾B.

If R is the relation determined by x = sin y, then R↾(ŷ(−π/2 ≤ y ≤ π/2))
is the function determined by y = arcsin x, using only the principal value
of the arcsin. If R is the relation determined by y² = x, then R↾(ŷ(0 ≤ y))
is the function determined by y = +√x.

Theorem X.4.5.

*I. ⊢ (α,R,x,y):x(α↿R)y. ≡ .x ∈ α.xRy.
II. ⊢ (β,R,x,y):x(R↾β)y. ≡ .xRy.y ∈ β.
*III. ⊢ (α,β,R,x,y):x(α↿R↾β)y. ≡ .x ∈ α.xRy.y ∈ β.

Proof of Part I.

⊢ x(α↿R)y. ≡ .x(α × V)y.xRy
. ≡ .x ∈ α.y ∈ V.xRy
. ≡ .x ∈ α.xRy.

Proof of Parts II and III. Similar.

Theorem X.4.6.

I. ⊢ (α,R).α↿R ⊆ R.
II. ⊢ (β,R).R↾β ⊆ R.
III. ⊢ (α,β,R).α↿R↾β ⊆ R.

Proof. Use Thm.IX.4.13, Cor. 6, with the definitions of α↿R, R↾β, and
α↿R↾β.

Theorem X.4.7.

I. ⊢ (α,R).α↿R ∈ Rel.
II. ⊢ (β,R).R↾β ∈ Rel.
III. ⊢ (α,β,R).α↿R↾β ∈ Rel.

Proof. Use Ex. X.3.3, part (d).

Theorem X.4.8.

I. ⊢ (α,β,R).(α↿R)↾β = α↿R↾β.
II. ⊢ (α,β,R).α↿(R↾β) = α↿R↾β.

Proof of Part I. By Thm.X.4.5,

⊢ x((α↿R)↾β)y. ≡ .x(α↿R)y.y ∈ β
. ≡ .x ∈ α.xRy.y ∈ β.

Proof of Part II. Similar.

Theorem X.4.9.

*I. ⊢ (α,R).Arg(α↿R) = α ∩ Arg(R).
II. ⊢ (β,R).Val(R↾β) = β ∩ Val(R).

Proof of Part I. Let x ∈ Arg(α↿R). Then x(α↿R)y. So x ∈ α.xRy. So
x ∈ α.x ∈ Arg(R). The converse proceeds readily.

Proof of Part II. Similar.

Theorem X.4.10.

I. ⊢ (α,R):Arg(R) ⊆ α. ≡ .R = α↿R.
II. ⊢ (β,R):Val(R) ⊆ β. ≡ .R = R↾β.

Proof of Part I. If R = α↿R, then by Thm.X.4.9, Arg(R) = α ∩ Arg(R),
so that Arg(R) ⊆ α. Conversely, let Arg(R) ⊆ α, and let xRy. Then
x ∈ Arg(R). So x ∈ α. So x(α↿R)y. So R ⊆ α↿R. However, α↿R ⊆ R.

Proof of Part II. Similar.

Corollary.

I. ⊢ (R).R = Arg(R)↿R.
II. ⊢ (R).R = R↾Val(R).
III. ⊢ (R).R = Arg(R)↿R↾Val(R).
Theorem X.4.11.

I. ⊢ (α,β,R).(α ∩ β)↿R = (α↿R) ∩ (β↿R).
II. ⊢ (α,β,R).R↾(α ∩ β) = (R↾α) ∩ (R↾β).
III. ⊢ (α,β,R).(α ∪ β)↿R = (α↿R) ∪ (β↿R).
V. ⊢ (α,β,R).(α ∩ β)↿R = α↿(β↿R).
VI. ⊢ (α,β,R).R↾(α ∩ β) = (R↾α)↾β.

Proof. Use the definitions of α↿R and R↾β with Thm.X.3.11.
Theorem X.4.12.

I. ⊢ (α,R,S):R ⊆ S. ⊃ .α↿R ⊆ α↿S.
II. ⊢ (β,R,S):R ⊆ S. ⊃ .R↾β ⊆ S↾β.
III. ⊢ (α,β,R,S):R ⊆ S. ⊃ .α↿R↾β ⊆ α↿S↾β.
IV. ⊢ (α,β,R):α ⊆ β. ⊃ .α↿R ⊆ β↿R.
V. ⊢ (α,β,R):α ⊆ β. ⊃ .R↾α ⊆ R↾β.

Proof. Use Thm.IX.4.16, Part I, and for Parts IV and V, use
Thm.X.3.12 in addition.

We define the converse of a relation R as the relation which holds between
x and y when yRx. Thus > is the converse of <, arcsine is the converse of
sine, square root is the converse of square, the logarithm is the converse of
the exponential, and in general the converse of a function is just the inverse
function. We use two different notations for the converse of R, namely,
Cnv(R) or R̆. So we define

Cnv(A)        for        x̂ŷ(yAx),
Ă             for        x̂ŷ(yAx),

where x and y are distinct variables not occurring in A. Cnv(A) is stratified
if and only if A is, and if stratified has the same type as A. Free
occurrences of variables are exactly those in A.
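
For finite relations the converse is obtained by swapping the members of each pair. The sketch below is our own illustration, with relations as Python sets of pairs over an assumed small domain D; the name cnv stands for Cnv.

```python
from itertools import product

def cnv(r):
    """Cnv(r): x is related to y exactly when y r x."""
    return {(y, x) for x, y in r}

D = range(4)
lt = {(x, y) for x, y in product(D, D) if x < y}
gt = {(x, y) for x, y in product(D, D) if x > y}
square = {(x, x * x) for x in D}

print(cnv(lt) == gt)              # > is the converse of <
print(sorted(cnv(square)))        # square root as the converse of square
print(cnv(cnv(square)) == square)
```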

Theorem X.4.13.
*I. ⊢ (R,x,y):xR̆y. ≡ .yRx.
II. ⊢ (R).R̆ ∈ Rel.

Corollary 1. ⊢ (R,x,y):x(Cnv(R̆))y. ≡ .xRy.

Corollary 2. ⊢ (R).Cnv(R̆) = R.

Corollary 3. ⊢ (R,S):R = S. ≡ .R̆ = S̆.

Corollary 4. ⊢ (α,β).Cnv(α × β) = β × α.

Corollary 5. ⊢ (R,S):R ⊆ S. ≡ .R̆ ⊆ S̆.

Theorem X.4.14. ⊢ (R,S).Cnv(R ∩ S) = Cnv(R) ∩ Cnv(S).

Proof.

⊢ x(Cnv(R ∩ S))y. ≡ .y(R ∩ S)x
. ≡ .yRx.ySx
. ≡ .xR̆y.xS̆y
. ≡ .x(R̆ ∩ S̆)y.

Corollary 1. ⊢ (α,R).Cnv(α↿R) = R̆↾α.

Corollary 2. ⊢ (β,R).Cnv(R↾β) = β↿R̆.

Corollary 3. ⊢ (α,β,R).Cnv(α↿R↾β) = β↿R̆↾α.

Theorem X.4.15. ⊢ (R,S).Cnv(R ∪ S) = Cnv(R) ∪ Cnv(S).

Proof. Similar to that of Thm.X.4.14.

Theorem X.4.16.

*I. ⊢ (R).Arg(R̆) = Val(R).
*II. ⊢ (R).Val(R̆) = Arg(R).

Proof of Part I. Let x ∈ Arg(R̆). Then xR̆y. So yRx. So x ∈ Val(R).
Conversely, if x ∈ Val(R), then x ∈ Arg(R̆).

Proof of Part II. Similar.

We define the relative product R|S of R and S as the relation which
holds between x and z when there is a y such that xRy.ySz. That is, we
define

A|B        for        x̂ẑ(Ey).xAy.yBz,

where x, y, and z are distinct variables not occurring in either A or B. A|B
is stratified if and only if A = B is stratified, and if stratified has the same
type as A and B. The free occurrences of variables are just those in A
and B.

In mathematics, if we have two functions f and g, then the relation
z = f(g(x)) defines the function g|f. As a transformation is just a function,
we see that if R and S are two transformations, then R|S is the product RS
of R and S in the usual sense of the product of two transformations, since
to form the product RS, we first apply R and then apply S. Thus the
associative law of multiplication for transformations is a special case of
Thm.X.4.18 below.

Incidentally, if R is a transformation with an inverse, then R̆ is that
inverse. Hence the corollary to Thm.X.4.17 below gives as a special case
the familiar result,

(RS)⁻¹ = (S⁻¹)(R⁻¹),

that the inverse of a product is the product of the inverses in reverse order.

If x = f(t) and y = g(t) are the parametric equations of a curve, then the
relation between x and y which has this same curve for a graph is f̆|g.
That is, the single equation y = (f̆|g)(x) is equivalent to the two
parametric equations x = f(t) and y = g(t).
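
The relative product and the parametric-curve remark can both be tried out on finite relations. The following sketch is ours and rests on stated assumptions: relations are Python sets of pairs ⟨argument, value⟩, the functions chosen (x + 1, 2y, 2t, t²) are arbitrary examples, and rel_prod and cnv are hypothetical names for the relative product and the converse.

```python
def cnv(r):
    """The converse of r."""
    return {(y, x) for x, y in r}

def rel_prod(r, s):
    """r|s: x is related to z when x r y and y s z for some y."""
    return {(x, z) for x, y in r for y2, z in s if y == y2}

T = range(4)

# z = f(g(x)) defines g|f: apply g first, then f.
g = {(x, x + 1) for x in T}
f = {(y, 2 * y) for y in range(6)}
composite = rel_prod(g, f)

# Parametric equations x = 2t, y = t^2 as the single relation cnv(fx)|gy.
fx = {(t, 2 * t) for t in T}
gy = {(t, t * t) for t in T}
curve = rel_prod(cnv(fx), gy)

print(sorted(composite))
print(sorted(curve))
```

The pairs in curve are exactly the points (2t, t²) traced out by the parameter t, which is the sense in which one relation replaces the two parametric equations.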

There is a striking similarity between the notation {A | P} for the class
of all A's such that P and the notation {R|S} for the unit class of the
relative product R|S. Theoretically, confusion between the two is impossible,
since in the first case P must be a statement and in the second case S must
be a term. Also in the first case we customarily leave a space on each side
of the | and in the second case we do not. Also the second case is not used
anywhere that we know of. However, to render confusion impossible, we
shall agree that in the second case we shall always enclose R|S in
parentheses before enclosing it in braces, so that the second case will always be
written {(R|S)}. As the first case could not conceivably be written
{(A|P)}, we now have a unique determination of the notation.

Theorem X.4.17.

*I. ⊢ (R,S,x,z):x(R|S)z. ≡ .(Ey).xRy.ySz.
II. ⊢ (R,S).R|S ∈ Rel.

**Corollary. ⊢ (R,S).Cnv(R|S) = S̆|R̆.

**Theorem X.4.18. ⊢ (R,S,T).R|(S|T) = (R|S)|T.

Proof.

⊢ x(R|(S|T))y. ≡ .(Eu).xRu.u(S|T)y
. ≡ .(Eu,v).xRu.uSv.vTy
. ≡ .(Ev,u).xRu.uSv.vTy
. ≡ .(Ev).x(R|S)v.vTy
. ≡ .x((R|S)|T)y.

Let R|S|T denote either of (R|S)|T or R|(S|T).
Theorem X.4.19.

I. ⊢ (R,S).Arg(R|S) = Arg(R↾Arg(S)).
II. ⊢ (R,S).Val(R|S) = Val(Val(R)↿S).

Proof of Part I.

⊢ x ∈ Arg(R|S)
≡ .(Ez).x(R|S)z
≡ .(Ez,y).xRy.ySz
≡ .(Ey).xRy.(Ez).ySz
≡ .(Ey).xRy.y ∈ Arg(S)
≡ .(Ey).x(R↾Arg(S))y
≡ .x ∈ Arg(R↾Arg(S)).

Proof of Part II. Similar.

Corollary.

I. ⊢ (R,S).Arg(R|S) ⊆ Arg(R).
II. ⊢ (R,S).Val(R|S) ⊆ Val(S).
*III. ⊢ (R,S):Val(R) ⊆ Arg(S). ⊃ .Arg(R|S) = Arg(R).
IV. ⊢ (R,S):Arg(S) ⊆ Val(R). ⊃ .Val(R|S) = Val(S).

In Parts I and II use Thm.X.4.6 and Thm.X.4.2, and in Parts III and IV
use Thm.X.4.10.

Theorem X.4.20.
I. ⊢ (α,R,S).(α↿R)|S = α↿(R|S).
II. ⊢ (β,R,S).R|(S↾β) = (R|S)↾β.

Proof of Part I.

⊢ x((α↿R)|S)z
≡ .(Ey).x(α↿R)y.ySz
≡ .(Ey).x ∈ α.xRy.ySz
≡ .x ∈ α.(Ey).xRy.ySz
≡ .x ∈ α.x(R|S)z
≡ .x(α↿(R|S))z.

Proof of Part II. Similar.

Corollary. ⊢ (α,β,R,S).(α↿R)|(S↾β) = α↿(R|S)↾β.

Theorem X.4.21. ⊢ (α,R,S).(R↾α)|S = R|(α↿S).

Proof. Similar to that of Thm.X.4.20.

The notion Val(α↿R) is very important. In terms of the "graph" of R,
it may be thought of as the projection on the y-axis of those points whose
x-coordinate is in α. Although we have already available one notation for
this notion, there is another notation in common use, namely,

A"B        for        Val(B↿A).

The stratification conditions for A"B are the same as those for Val(B↿A),
namely, that A = B must be stratified, and if stratified A"B has the same
type as A and B. The free occurrences of variables are just those in A
and B.

Whenever R"α occurs as part of a formula we shall give the smallest
possible scope to ". Thus

R"α ∪ β means (R"α) ∪ β rather than R"(α ∪ β),
α ∪ R"β means α ∪ (R"β) rather than (α ∪ R)"β,
etc.

In the case of S"R"α, the convention about the scope of " is ambiguous.
We agree that S"R"α stands for S"(R"α), rather than for (S"R)"α.

If R is a transformation, then R"α is the region that α is mapped into by
R. Hence the notion R"α is important in transformation theory. If we
start with a region α and apply successively the transformations R and S,
we transform α into S"(R"α).

We refer to R"α as the map of α or projection of α by R.
Logicians usually define R"α as Arg(R↾α). In our notation, this would
be R̆"α. Thus our theorems about R"α will differ from the usual ones by
having R replaced by R̆.

The reason for our variance from the usual convention is that, in the
formula ⟨x,y⟩ ∈ R, we think of x as the argument and y as the value, in
accordance with the usual analytical geometry convention, whereas most
logicians think of y as the argument and x as the value in xRy.
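
The difference between the two conventions shows up clearly on a finite example. The sketch below is our own, under the assumption that relations are Python sets of pairs ⟨argument, value⟩; the relation divides on the integers 1 through 6 and the names image and cnv are chosen for illustration only.

```python
def cnv(r):
    """The converse of r."""
    return {(y, x) for x, y in r}

def image(r, a):
    """The map of a by r: every y with x r y for some x in a."""
    return {y for x, y in r if x in a}

# x divides y, for x and y between 1 and 6.
divides = {(x, y) for x in range(1, 7) for y in range(1, 7) if y % x == 0}

ours = image(divides, {2, 3})           # values reached from 2 or 3
usual = image(cnv(divides), {4, 6})     # arguments that reach 4 or 6

print(sorted(ours))
print(sorted(usual))
```

Here image(divides, a) collects multiples of members of a, while the logicians' reading, obtained through the converse, collects divisors.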
*Theorem X.4.22. ⊢ (α,R,y):y ∈ R"α. ≡ .(Ex).xRy.x ∈ α.

*Corollary. ⊢ (R,x,y):y ∈ R"{x}. ≡ .xRy.

Theorem X.4.23. ⊢ (β,R).R̆"β = Arg(R↾β).

Proof.

⊢ R̆"β = Val(β↿R̆)
= Val(Cnv(R↾β))
= Arg(R↾β).

Corollary. ⊢ (β,R,x):x ∈ R̆"β. ≡ .(Ey).xRy.y ∈ β.

Theorem X.4.24.

I. ⊢ (α,R).R"α ⊆ Val(R).
II. ⊢ (β,R).R̆"β ⊆ Arg(R).

Proof of Part I. By Thm.X.4.6, α↿R ⊆ R. So by Thm.X.4.2,
Val(α↿R) ⊆ Val(R).

Proof of Part II. Similar.

Theorem X.4.25. ⊢ (α,β,R):α ⊆ β. ⊃ .R"α ⊆ R"β.

Proof. Use Thm.X.4.12, Part IV, and Thm.X.4.2.

Theorem X.4.26.
I. ⊢ (R).Val(R) = R"Arg(R).
II. ⊢ (R).Arg(R) = R̆"Val(R).

Proof of Part I. Let y ∈ Val(R). Then xRy. So xRy.x ∈ Arg(R). So
y ∈ R"Arg(R). The converse goes easily.

Proof of Part II. Similar.

Theorem X.4.27.
I. ⊢ (α,R).R"α = R"(α ∩ Arg(R)).
II. ⊢ (β,R).R̆"β = R̆"(β ∩ Val(R)).

Proof of Part I. By Thm.X.4.9, Part I, ⊢ Arg(α↿R) ⊆ Arg(R). So by
Thm.X.4.10, Part I, ⊢ α↿R = Arg(R)↿(α↿R). So by Thm.X.4.11, Part V,
⊢ α↿R = (α ∩ Arg(R))↿R. So ⊢ Val(α↿R) = Val((α ∩ Arg(R))↿R).

Proof of Part II. Similar.

Corollary.
I. ⊢ (R).R"V = Val(R).
II. ⊢ (R).R̆"V = Arg(R).

Theorem X.4.28. ⊢ (α,R,S).(R|S)"α = S"R"α.

Proof. By Thm.X.4.20, Part I, ⊢ α↿(R|S) = (α↿R)|S. So ⊢ (R|S)"α =
Val((α↿R)|S). But by Thm.X.4.19, Part II, ⊢ Val((α↿R)|S) = S"Val(α↿R).

This theorem has an interesting interpretation in terms of transformation
theory. As we noted earlier, R|S is the product of the transformations R
and S (in the sense of applying first R, then S). This theorem says that we
get the same map whether we apply R and S successively, getting S"R"α,
or whether we apply the product transformation directly, getting (R|S)"α.

Usually in mathematics, transformations are functions, so that we would
usually be interested in the special cases of Thm.X.4.17, corollary,
Thm.X.4.18, Thm.X.4.19, Thm.X.4.20, and Thm.X.4.28 which result when we
take R and S to be functions. However, in algebraic geometry, use is
made of many-valued mappings called correspondences, and there the full
strength of these theorems is useful. Note that the product of two
correspondences R and S is exactly R|S (see Coolidge, 1931, page 125).
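
Thm.X.4.28 can be observed on two small many-valued correspondences. The sketch below is an illustration of ours, not a proof: the relations R and S and the class alpha are arbitrary sample data, and rel_prod and image are hypothetical names for the relative product and the map of a class.

```python
def rel_prod(r, s):
    """r|s: x is related to z when x r y and y s z for some y."""
    return {(x, z) for x, y in r for y2, z in s if y == y2}

def image(r, a):
    """The map of a by r: every y with x r y for some x in a."""
    return {y for x, y in r if x in a}

# Two many-valued correspondences (neither is a function).
R = {(1, "a"), (1, "b"), (2, "b"), (3, "c")}
S = {("a", 10), ("b", 10), ("b", 20), ("c", 30)}
alpha = {1, 2}

direct = image(rel_prod(R, S), alpha)       # apply the product at once
stepwise = image(S, image(R, alpha))        # apply R, then S

print(sorted(direct), sorted(stepwise))
```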

Theorem X.4.29.

I. ⊢ (R).Arg(RUSC(R)) = USC(Arg(R)).
II. ⊢ (R).Val(RUSC(R)) = USC(Val(R)).
III. ⊢ (R).AV(RUSC(R)) = USC(AV(R)).

Proof of Part I. Let x ∈ Arg(RUSC(R)). Then by rule C, x(RUSC(R))y.
Then by Thm.X.3.15 and rule C, uRv.x = {u}.y = {v}. So u ∈ Arg(R).
x = {u}. So by Thm.IX.6.10, Part III, x ∈ USC(Arg(R)). Conversely,
let x ∈ USC(Arg(R)). Then u ∈ Arg(R).x = {u}. So uRv.x = {u}. Hence
uRv.x = {u}.{v} = {v}. So x(RUSC(R)){v}. So x ∈ Arg(RUSC(R)).

Proof of Part II. Similar.

Theorem X.4.30.
I. ⊢ (α,R).RUSC(α↿R) = USC(α)↿RUSC(R).
II. ⊢ (β,R).RUSC(R↾β) = RUSC(R)↾USC(β).

Proof of Part I.

⊢ x(RUSC(α↿R))y
≡ :(Eu,v).u(α↿R)v.x = {u}.y = {v}
≡ :(Eu,v).u ∈ α.uRv.x = {u}.y = {v}
≡ :(Eu,v).{u} ∈ USC(α).uRv.x = {u}.y = {v}
≡ :(Eu,v).x ∈ USC(α).uRv.x = {u}.y = {v}
≡ :x ∈ USC(α):(Eu,v).uRv.x = {u}.y = {v}
≡ :x ∈ USC(α).x(RUSC(R))y
≡ :x(USC(α)↿RUSC(R))y.

Proof of Part II. Similar.

Corollary. ⊢ (β,R).(RUSC(R))"USC(β) = USC(R"β).

Theorem X.4.31. ⊢ (R).RUSC(R̆) = Cnv(RUSC(R)).

Proof.

⊢ x(RUSC(R̆))y
. ≡ .(Eu,v).uR̆v.x = {u}.y = {v}
. ≡ .(Eu,v).vRu.y = {v}.x = {u}
. ≡ .y(RUSC(R))x
. ≡ .x(Cnv(RUSC(R)))y.

Theorem X.4.32. ⊢ (R,S).RUSC(R|S) = RUSC(R)|RUSC(S).

Proof. Assume x(RUSC(R|S))z. Then by rule C, u(R|S)w.x = {u}.
z = {w}. Then by rule C, uRv.vSw.x = {u}.z = {w}. So x(RUSC(R)){v}.
{v}(RUSC(S))z. So x(RUSC(R)|RUSC(S))z.

Conversely, assume x(RUSC(R)|RUSC(S))z. Then by rule C,
x(RUSC(R))y.y(RUSC(S))z. So by rule C, uRv.x = {u}.y = {v}.wSt.
y = {w}.z = {t}. So w = v. So uRv.vSt.x = {u}.z = {t}. So u(R|S)t.
x = {u}.z = {t}. So x(RUSC(R|S))z.

An  important  notion  is  the  closure  of  a  class  a.  with  respect  to  a  function 
/.  Intuitively,  we  get  this  as  follows.  We  start  with  the  members  of  a.  For 
each  X  which  is  a  member  of  a,  we  form /(re).  We  enlarge  a  by  adding  each 
of  these /(a;) 's  to  it  as  a  new  member.  We  now  repeat  this  process  with  the 
enlarged  a,  getting  a  further  enlarged  a.    We  reiterate  this  process  (an 
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infinite  number  of  times  if  necessary)  until  we  arrive  at  a  class  13  that  cannot 
be  further  enlarged  by  this  process.  Such  a  class  (3  is  said  to  be  "closed 
with  respect  to/"  and  is  called  the  closure  of  a  with  respect  to/. 

More generally, we can form the closure of a class $\alpha$ with respect to a
relation R. If we think of R as a many-valued function, the procedure is
quite analogous. Instead of adding f(x) to $\alpha$, we add all members of $R''\{x\}$.
As soon as we get to a $\beta$ which cannot be enlarged by this procedure, we
stop. We say that $\beta$ is closed with respect to R and is the closure of $\alpha$
with respect to R.

Another way to imagine forming $\beta$ is to add to $\alpha$ its map $R''\alpha$ by R, then
to add the map of this, etc. In other words, the closure of $\alpha$ with respect
to R is

\[ \alpha \cup (R''\alpha) \cup (R''R''\alpha) \cup (R''R''R''\alpha) \cup \cdots \]
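
When the relation is finite, the intuitive procedure just described terminates and can be carried out literally. The Python sketch below is an illustration under that assumption (it does not treat the case where infinitely many repetitions are needed); the names image and closure are our own, with image(R, alpha) standing for the map of alpha by R.

```python
def image(rel, alpha):
    """The map of alpha by R: all z such that xRz for some x in alpha."""
    return {z for (x, z) in rel if x in alpha}


def closure(alpha, rel):
    """Closure of alpha with respect to a finite relation, by repeated enlargement."""
    beta = set(alpha)
    while True:
        new_members = image(rel, beta) - beta
        if not new_members:      # beta cannot be enlarged further: it is closed
            return beta
        beta |= new_members


# Sample relation: n is related to n + 2 for n = 0, 1, ..., 9.
R = {(n, n + 2) for n in range(10)}
print(sorted(closure({0}, R)))   # the even numbers reachable from 0
```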

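As a matter of fact, the notion of the closure of $\alpha$ with respect to R in the
sense just described is merely a special case of the notion of closure discussed
in Sec. 5 of Chapter IX. In the notation proposed there, the closure of $\alpha$
with respect to R would be denoted by $\mathrm{Clos}(\alpha, xRz)$, which would denote

\[ \bigcap \hat{\beta}\,(\alpha \subseteq \beta :. (x,z): x \in \beta \,.\, xRz \,.\supset.\, z \in \beta). \]

Now

\[ \vdash (\beta,R) :: (x,z): x \in \beta \,.\, xRz \,.\supset.\, z \in \beta \,.: \equiv :.\, R''\beta \subseteq \beta. \]

So

\[ \vdash (\alpha,R).\ \mathrm{Clos}(\alpha,xRz) = \bigcap \hat{\beta}\,(\alpha \subseteq \beta \,.\, R''\beta \subseteq \beta). \]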
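
The characterization of the closure as an intersection can likewise be tried out in a finite setting. The following Python sketch assumes a finite universe that contains $\alpha$ and every object related by R, so that the universe itself is one of the closed classes; the names image and closure_by_intersection, and the sample data, are hypothetical choices for the illustration.

```python
from itertools import combinations


def image(rel, alpha):
    """The map of alpha by R: all z such that xRz for some x in alpha."""
    return {z for (x, z) in rel if x in alpha}


def closure_by_intersection(alpha, rel, universe):
    """Intersect every beta within the universe with alpha <= beta and image(R, beta) <= beta."""
    result = set(universe)
    members = list(universe)
    for size in range(len(members) + 1):
        for combo in combinations(members, size):
            beta = set(combo)
            if alpha <= beta and image(rel, beta) <= beta:
                result &= beta
    return result


universe = {0, 1, 2, 3, 4, 5}
R = {(0, 1), (1, 2), (3, 4)}
alpha = {0}

clos = closure_by_intersection(alpha, R, universe)
print(sorted(clos))
# The closure is alpha together with its own map by R.
print(clos == alpha | image(R, clos))
```

The search runs through every subclass of the universe, so it is practical only for very small examples.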

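Then Thms.IX.5.12 to IX.5.16 reduce to:

Theorem X.4.33. $\vdash (\alpha,R,y) :.\, y \in \mathrm{Clos}(\alpha,xRz) : \equiv : (\beta): \alpha \subseteq \beta \,.\, R''\beta \subseteq \beta \,.\supset.\, y \in \beta$.

Theorem X.4.34. $\vdash (\alpha,R).\ \alpha \subseteq \mathrm{Clos}(\alpha,xRz)$.

Theorem X.4.35. $\vdash (\alpha,R).\ R''\mathrm{Clos}(\alpha,xRz) \subseteq \mathrm{Clos}(\alpha,xRz)$.

Theorem X.4.36. $\vdash (\alpha,\beta,R): \alpha \subseteq \beta \,.\, R''(\beta \cap \mathrm{Clos}(\alpha,xRz)) \subseteq \beta \,.\supset.\, \mathrm{Clos}(\alpha,xRz) \subseteq \beta$.

Theorem X.4.37. $\vdash (\alpha,R).\ \mathrm{Clos}(\alpha,xRz) = \alpha \cup R''\mathrm{Clos}(\alpha,xRz)$.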


EXERCISES 


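X.4.1. Prove:

I. $\vdash (\alpha,\beta,\gamma): \gamma \neq \Lambda \,.\supset.\, \alpha \subseteq \beta \equiv \alpha \times \gamma \subseteq \beta \times \gamma$.

II. $\vdash (\alpha,\beta,\gamma): \gamma \neq \Lambda \,.\supset.\, \alpha \subseteq \beta \equiv \gamma \times \alpha \subseteq \gamma \times \beta$.

X.4.2. Prove:

I. $\vdash (\alpha,\beta).\ \alpha \times \beta = (\alpha \times V)|(V \times \beta)$.

II. $\vdash (\alpha,\beta).\ \alpha \times \beta = (\alpha \times V) \cap (V \times \beta)$.

III. $\vdash (\alpha,\beta).\ \alpha \times \beta = \alpha \upharpoonleft (V \times V) \upharpoonright \beta$.

IV. $\vdash (\alpha,\beta).\ \alpha \times \beta = \alpha \upharpoonleft V \upharpoonright \beta$.

X.4.3. Under what circumstances would one have $\vdash R = V \upharpoonleft R$?

X.4.4. Prove $\vdash (\alpha,\beta,\gamma,\delta).\ (\alpha \times \beta) \cap (\gamma \times \delta) = (\alpha \cap \gamma) \times (\beta \cap \delta)$.

X.4.5. Prove:

(a) $\vdash (R,S,T): R \subseteq S \,.\supset.\, R|T \subseteq S|T$.

(b) $\vdash (R,S,T): R \subseteq S \,.\supset.\, T|R \subseteq T|S$.

(c) $\vdash (\alpha,R,S): R \subseteq S \,.\supset.\, R''\alpha \subseteq S''\alpha$.

(d) $\vdash (R).\ \Lambda \upharpoonleft R = \Lambda$.

(e) $\vdash (R).\ R \upharpoonright \Lambda = \Lambda$.

(f) $\vdash (R).\ R''\Lambda = \Lambda$.

(g) $\vdash (\alpha).\ \alpha \upharpoonleft \Lambda = \Lambda$.

(h) $\vdash (\beta).\ \Lambda \upharpoonright \beta = \Lambda$.

(i) $\vdash (\alpha).\ \Lambda''\alpha = \Lambda$.

(j) $\vdash \mathrm{Cnv}(\Lambda) = \Lambda$.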

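X.4.6. Prove:

(a) $\vdash (R,S).\ \mathrm{Arg}(R \cup S) = \mathrm{Arg}(R) \cup \mathrm{Arg}(S)$.

(b) $\vdash (R,S).\ \mathrm{Val}(R \cup S) = \mathrm{Val}(R) \cup \mathrm{Val}(S)$.

(c) $\vdash (R,S).\ \mathrm{AV}(R \cup S) = \mathrm{AV}(R) \cup \mathrm{AV}(S)$.

(d) $\vdash (\alpha,\beta,R).\ R''(\alpha \cup \beta) = (R''\alpha) \cup (R''\beta)$.

(e) $\vdash (R,S).\ \mathrm{Arg}(R \cap S) \subseteq \mathrm{Arg}(R) \cap \mathrm{Arg}(S)$.

(f) $\vdash (R,S).\ \mathrm{Val}(R \cap S) \subseteq \mathrm{Val}(R) \cap \mathrm{Val}(S)$.

(g) $\vdash (R,S).\ \mathrm{AV}(R \cap S) \subseteq \mathrm{AV}(R) \cap \mathrm{AV}(S)$.

(h) $\vdash (\alpha,\beta,R).\ R''(\alpha \cap \beta) \subseteq (R''\alpha) \cap (R''\beta)$.

(i) $\vdash (\alpha,R,S).\ (R \cup S)''\alpha = (R''\alpha) \cup (S''\alpha)$.

(j) $\vdash (\alpha,R,S).\ (\alpha \upharpoonleft R) \cap (\alpha \upharpoonleft S) = \alpha \upharpoonleft (R \cap S)$.

(k) $\vdash (\beta,R,S).\ (R \upharpoonright \beta) \cap (S \upharpoonright \beta) = (R \cap S) \upharpoonright \beta$.

(l) $\vdash (\alpha,R,S).\ (\alpha \upharpoonleft R) \cup (\alpha \upharpoonleft S) = \alpha \upharpoonleft (R \cup S)$.

(m) $\vdash (\beta,R,S).\ (R \upharpoonright \beta) \cup (S \upharpoonright \beta) = (R \cup S) \upharpoonright \beta$.

X.4.7. Find illustrations to show the falsity of each of:

(a) $(R,S).\ \mathrm{Arg}(R) \cap \mathrm{Arg}(S) \subseteq \mathrm{Arg}(R \cap S)$.

(b) $(R,S).\ \mathrm{Val}(R) \cap \mathrm{Val}(S) \subseteq \mathrm{Val}(R \cap S)$.

(c) $(R,S).\ \mathrm{AV}(R) \cap \mathrm{AV}(S) \subseteq \mathrm{AV}(R \cap S)$.

(d) $(\alpha,\beta,R).\ (R''\alpha) \cap (R''\beta) \subseteq R''(\alpha \cap \beta)$.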

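X.4.8. Prove:

(a) $\vdash (R,S,T): R|(S \cap T) \subseteq (R|S) \cap (R|T)$.

(b) $\vdash (R): R|\Lambda = \Lambda|R = \Lambda$.

(c) $\vdash (R,S,T): R|(S \cup T) = (R|S) \cup (R|T)$.

(d) $\vdash (R,x): xRx \,.\supset.\, x(R|R)x$.

(e) $\vdash (R,S): R \subseteq S \,.\supset.\, (R|R) \subseteq (S|S)$.

(f) $\vdash (\alpha,\beta,R,S).\ (\alpha \upharpoonleft R) \cap (\beta \upharpoonleft S) = (\alpha \cap \beta) \upharpoonleft (R \cap S)$.

(g) $\vdash (\alpha,\beta,\gamma): \beta \neq \Lambda \,.\supset.\, (\alpha \times \beta)|(\beta \times \gamma) = \alpha \times \gamma$.

(h) $\vdash (R,S).\ \mathrm{Val}(R|S) = S''\mathrm{Val}(R)$.

(i) $\vdash (R,S).\ \mathrm{Arg}(R|S) = \breve{R}''\mathrm{Arg}(S)$.

X.4.9. Prove:

(a) $\vdash (\beta,R) :: (x,z): x \in \beta \,.\, xRz \,.\supset.\, z \in \beta \,.: \equiv :.\, R''\beta \subseteq \beta$.

(b) $\vdash (\alpha,\beta,R).\ R''(\beta \cap \mathrm{Clos}(\alpha,xRz)) \subseteq \mathrm{Clos}(\alpha,xRz)$.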

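X.4.10. Whitehead and Russell denote the ancestral relation of R by $R_*$,
which they define as

\[ \hat{u}\hat{v}\,(u \in \mathrm{AV}(R) \,.\, v \in \mathrm{Clos}(\{u\},xRz)) \]

where u and v do not occur in xRz. Prove:

(a) $\vdash (R,x,y) :: xR_*y \,.: \equiv :.\, x \in \mathrm{AV}(R) :. (\beta): x \in \beta \,.\, R''\beta \subseteq \beta \,.\supset.\, y \in \beta$.

(b) $\vdash (R,x): x \in \mathrm{AV}(R) \,.\equiv.\, xR_*x$.

(c) $\vdash (R,x,y,z): xR_*y \,.\, yRz \,.\supset.\, xR_*z$.

(e) $\vdash (R,u): u \in \mathrm{AV}(R) \,.\supset.\, R_*''\{u\} = \mathrm{Clos}(\{u\},xRz)$.

(f) $\vdash (R,u) :.\, \sim u \in \mathrm{AV}(R) : \supset : R_*''\{u\} = \Lambda \,.\, \mathrm{Clos}(\{u\},xRz) = \{u\}$.

(g) $\vdash (\beta,R,x,y): R''(\beta \cap R_*''\{x\}) \subseteq \beta \,.\, x \in \beta \cap \mathrm{AV}(R) \,.\, xR_*y \,.\supset.\, y \in \beta$.

(h) $\vdash (R,x) :.\, x \in \mathrm{AV}(R) : \supset : (y).\ xR_*y \,.\equiv.\, x = y \,.\mathrm{v}.\, x(R_*|R)y$.

(i) $\vdash (R).\ (R|(R_*)) \subseteq R_*$.

(j) $\vdash (R).\ R \subseteq (R_*|R)$.

(k) $\vdash (R).\ R \subseteq (R|(R_*))$.

(l) $\vdash (R).\ R \subseteq R_*$.

(m) $\vdash (R).\ \mathrm{Arg}(R_*) = \mathrm{Val}(R_*) = \mathrm{AV}(R)$.

(n) $\vdash (R).\ (R_*)|(R_*) = R_*$.

(o) $\vdash (R).\ (R_*)_* = R_*$.

(p) $\vdash (R).\ \mathrm{Cnv}(R_*) = (\mathrm{Cnv}(R))_*$.

(q) $\vdash (R).\ (R_*)|R = R|(R_*)$.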
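
Before attempting the proofs in X.4.10, the reader may find it helpful to compute an ancestral relation for a small finite R. The Python sketch below is an illustration only, assuming a relation is a finite set of ordered pairs and taking AV(R) to be the class of all arguments and values of R; the names field, closure, ancestral, and rel_product are our own.

```python
def image(rel, alpha):
    """The map of alpha by R: all z such that xRz for some x in alpha."""
    return {z for (x, z) in rel if x in alpha}


def closure(alpha, rel):
    """Closure of alpha with respect to a finite relation R."""
    beta = set(alpha)
    while not image(rel, beta) <= beta:
        beta |= image(rel, beta)
    return beta


def field(rel):
    """AV(R): everything occurring as an argument or a value of R."""
    return {x for pair in rel for x in pair}


def ancestral(rel):
    """All pairs (u, v) with u in AV(R) and v in the closure of {u}."""
    return {(u, v) for u in field(rel) for v in closure({u}, rel)}


def rel_product(r, s):
    """Relative product R|S."""
    return {(x, z) for (x, y1) in r for (y2, z) in s if y1 == y2}


R = {(1, 2), (2, 3)}
R_star = ancestral(R)
print(sorted(R_star))
print(rel_product(R_star, R) == rel_product(R, R_star))   # instance of part (q)
```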

5. Functions. As we said earlier, we are restricting the term "function"
to mean "single-valued function." So a relation R is a function if and only
if

\[ (x,y,z): xRy \,.\, xRz \,.\supset.\, y = z. \]

Accordingly we define

Funct    for    $\hat{R}\,((x,y,z): xRy \,.\, xRz \,.\supset.\, y = z)$.

Thus Funct is the class of all functions. To state that R is a function,
we write $R \in \mathrm{Funct}$. Because we used in the definition of Funct the variable
R which is restricted to the range Rel, every function is a relation (see
Thm.IX.7.9).

Funct is stratified and may be assigned any arbitrary type since it has
no free variables.

Functions  are  sometimes  referred  to  as  many-one  relations. 

If R is a function and $x \in \mathrm{Arg}(R)$, then there is a unique y such that xRy.
In the standard mathematical notation, this y is denoted by R(x), or
sometimes by just Rx (as when R is cos, log, etc.). We shall uniformly use
R(x) to denote the unique y such that xRy. Thus we put

A(B)    for    $\iota y\,(BAy)$,

where y does not occur in A or B. A(B) is stratified if and only if $B \in A$ is
stratified, and if stratified has the same type as B, which would be one less
than the type of A. The free occurrences of variables are just those in A
and B, respectively.
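
The two definitions just given can be imitated for finite relations. In the Python sketch below, which is only an illustration, a relation is a set of ordered pairs; is_function tests the defining condition of Funct, and apply returns the unique y with xRy. Both names are our own, and the decision to raise an error when there is no unique y is an assumption of the sketch, not a feature of the description operator in the text.

```python
def is_function(rel):
    """True if xRy and xRz always imply y = z."""
    values = {}
    for x, y in rel:
        if x in values and values[x] != y:
            return False
        values[x] = y
    return True


def apply(rel, x):
    """R(x): the unique y such that xRy (an error if there is not exactly one)."""
    ys = {y for (u, y) in rel if u == x}
    if len(ys) != 1:
        raise ValueError(f"no unique value at {x!r}")
    return next(iter(ys))


square = {(x, x * x) for x in range(-3, 4)}     # a many-one relation
root = {(y, x) for (x, y) in square}            # its converse, not single-valued

print(is_function(square), is_function(root))
print(apply(square, -2))
```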

Most mathematicians will agree that $x^2$ is a function of x. We have said
that a function is a class of ordered pairs. Is then $x^2$ a class of ordered
pairs? We think not. If x is a "real variable," then $x^2$ is a variable,
nonnegative, real number got by multiplying x by itself and can scarcely
be a class of ordered pairs.

Although $x^2$ is not itself a class of ordered pairs, it determines such a
class. If we graph $y = x^2$, the points of the graph are a class of ordered
pairs. To be precise, the graph consists of the class of ordered pairs
$\{\langle x,y\rangle \mid y = x^2\}$, and when one speaks of $x^2$ as a function, it is precisely this
class of ordered pairs which one has in mind. However, $x^2$ itself is certainly
not this class of ordered pairs.

If we denote temporarily $\{\langle x,y\rangle \mid y = x^2\}$ by f, then by our definition of
f(x) given above we have

\[ f(x) = \iota y\,(x f y) = \iota y\,(y = x^2) = x^2. \]

Thus with $\{\langle x,y\rangle \mid y = x^2\}$ for f, f(x) is $x^2$. That is, the relationship
between $x^2$ and $\{\langle x,y\rangle \mid y = x^2\}$ is the same as the relationship between f(x)
and f. That is, $x^2$ is the result of applying the operation of squaring to the
variable x, whereas $\{\langle x,y\rangle \mid y = x^2\}$ denotes the actual operation of squaring.

It would appear that we are trying to discredit the common belief that
$x^2$ is a function of x. This is not the case at all. We are merely trying to
emphasize rather strongly that the word "function" is used in two quite
different senses. It is common to refer to $x^2$ as a function, and it is also
common to speak of $\{\langle x,y\rangle \mid y = x^2\}$ as a function (indeed, as the same
function!). Nevertheless, $x^2$ and $\{\langle x,y\rangle \mid y = x^2\}$ are quite different
objects. Lest there be any lingering doubts on this point, let us note that
unquestionably $x^2$ is a variable and unquestionably $\{\langle x,y\rangle \mid y = x^2\}$ is a
constant.

To put the matter in terms of more familiar notation, it is common to
speak of f(x) as a function and also common to speak of f as a function, and
yet certainly f(x) and f are not the same. It is unfortunate that there is
this double usage of the word "function." It is perhaps worth while to
give a little thought to the historical development of the notion of function,
so as to see how the present state of affairs came about.

We go back to Euler, who was perhaps one of the first mathematicians
to give a fairly precise specification of his notion of a function. In modern
terms, Euler's notion of a function of x is what we would speak of as a
formula involving x, such as $x^2$, sin x, etc. It is perhaps not known whether
Euler would have regarded

''  r"^  dy 

as a function or not, but certainly he would not have considered as a
function the modern function whose value is unity at all rational points and
zero at all irrational points.

Euler's notion of function was soon found to be far too restrictive, and
it was generalized to somewhat the following:

"If a variable y is so related to a variable x that for each value of x in a
range R there is determined a unique value of y, then y is a function of the
variable x, defined over R, and we write y = f(x)."

This  definition,  or  some  approximation  thereto,  is  to  be  found  in  the  usual 
modern  calculus  textbook.  Nevertheless,  this  definition  is  quite  unclear. 
When  y  is  an  explicit  formula  containing  x,  it  is  clear  how  there  can  be  a 
relation  between  x  and  y.  However,  the  definition  given  above  certainly 
intends  something  more  general.  Unfortunately,  if  y  is  to  be  thought  of  as 
an  abstract  variable,  it  is  hard  to  see  how  it  can  keep  its  values  sorted  out 
and  properly  related  to  the  values  of  x,  or  indeed  how  it  goes  about  having 
values  at  all.  That  this  point  is  far  from  clear  is  evident  if  one  looks  over  a 
large number of calculus textbooks and observes the very confused
descriptions of the notion of function which appear in the poorer ones.

In order to clear up this point, there evolved the modern notion of a
function as a rule for determining values for y from values for x in the range
R, or (which amounts to the same thing) as a class of ordered pairs $\langle x,y\rangle$.
This is a very precise idea and is very satisfactory for the development of
function theory, but it is quite a different idea from that which is intended
in the definition quoted earlier. In the definition quoted earlier, one is
thinking of f(x) as the function, whereas in the modern definition of a
function as a rule (or class of ordered pairs), one is thinking of f as the
function.

Actually, f and f(x), the rule and the result of applying the rule, are very
different, and it is confusing to use the word "function" to apply to both.
One should definitely decide to call one by the name "function" and then
devise a new name for the other. The present trend in higher mathematics
is to reserve the name "function" for f, but even those who advocate this
are usually inconsistent in their use of the word "function." Moreover,
they have given no name to "f(x)." Probably this trend is due to the
difficulty of making precise the definition of a function which conceives of
f(x) as a function.

Let us return to this definition and consider more carefully the difficulties
which it raises. We are to conceive of the variable y as being so related to
a variable x that values of x in R determine values of y. Again we ask,
what manner of object is y? For instance, it is clearly intended in this
definition that the relation between x and y could be such that y has the
value 3 for each value of x in R. But then y is not a variable. Worse still,
if the range R of x consists of a single point, then neither x nor y is variable.
Clearly, what is intended is some generalization of Euler's idea that y is a
formula containing x. This would permit y to be constant, e.g.,
$y = (3 + x) - x$, or even permit the range of x to be a single point, e.g.,
$y = \sqrt{-x^2}$, where in real variables x could only be zero. The trouble is
that classical mathematics has no precise way of prescribing such a
generalization.

If we use the resources of symbolic logic, such a generalization is quite
easy. We merely conceive of y as a term, A, containing free occurrences
of x. This will include all the familiar "formulas" of mathematics. It
also includes definitions by cases, as we saw in detail in Chapter VIII.
Thus we can write a term A, with free occurrences of x, such that

\[ \vdash (x): x \text{ is rational} \,.\supset.\, A = 1. \]
\[ \vdash (x): x \text{ is irrational} \,.\supset.\, A = 0. \]

It will turn out that we can write terms A which will serve as f(x) even
in the most difficult cases, such as when f(x) is defined by transfinite
induction. In fact, if we achieve our aim of constructing a symbolic logic in
which we can state all theorems of mathematics and prove those which are
generally accepted as true, then we can certainly write a term A to
represent any given function f(x), for one merely translates the specification of
f(x) into symbolic logic, and the resulting formula of the logic is A.

Hence we propose to make the familiar definition precise by replacing
"variable y" by "term A." Then the definition would read:

"If A is a term containing no free occurrences of any variable other than
x, then A is a function of the variable x."

If we define $f = \{\langle x,y\rangle \mid y = A\}$, then we have as a theorem

\[ \vdash (x).\ f(x) = A \]

if A contains free occurrences of x.
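
The passage from a term A to the class of ordered pairs f can be imitated on a finite range of values of x. The Python sketch below is only an illustration under that restriction; graph_of and value_at are hypothetical names, and the term A is supplied as a Python expression in x.

```python
def graph_of(term, domain):
    """The pairs (x, y) with y = A, for x in a finite domain; `term` computes A from x."""
    return {(x, term(x)) for x in domain}


def value_at(f, x):
    """f(x): the unique y such that (x, y) belongs to f."""
    ys = [y for (u, y) in f if u == x]
    if len(ys) != 1:
        raise ValueError(f"no unique value at {x!r}")
    return ys[0]


domain = range(-3, 4)

# A is the term x*x; f is a single fixed class of pairs.
f = graph_of(lambda x: x * x, domain)
print(all(value_at(f, x) == x * x for x in domain))

# A term may also denote a constant, as with (3 + x) - x.
g = graph_of(lambda x: (3 + x) - x, domain)
print({y for (_, y) in g})
```

Here f and g are constants, whereas x * x and (3 + x) - x have a value only once x is given, which is the distinction the text is drawing.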
We thus have available precise definitions of both uses of the word
function, namely, to denote f or to denote f(x). Thus we can call R a function
and then $R(x) = \iota y\,(xRy)$ is the corresponding value, or we can call A a
function and then $\{\langle x,y\rangle \mid y = A\}$ is the rule f which determines a value
of A for each value of x (in the sense that $\vdash (x).\ f(x) = A$). Either
procedure is equally precise in terms of symbolic logic. That is, in symbolic
logic we have adequate machinery for dealing with either f or f(x) with
complete precision, and so are at liberty to designate either as a "function."
But we must then use a different name for the other!

Actually, a perfectly good name for the notion f is available, namely,
"transformation." In algebraic geometry, a careful distinction is usually
made between a "transformation" f and a "general value" f(x) of the
transformation. Thus, one way out of our impasse would be always to
refer to f as a transformation and to reserve the term "function" to refer
to f(x). This would be quite satisfactory if it were generally adopted.

Another possible way would be to agree to refer to f as a "function" but
to f(x) as a "function of x." This sounds rather attractive at first but
would probably not work. In the first place, it requires that we treat the
phrase "function of x" as an indissoluble unit, which is contrary to the
rules of English grammar. Also, it is quite certain that "function of x"
would usually be abbreviated to "function," and we would be back to the
double usage of "function" which now prevails.

There is a third solution which is the one which we shall adopt. It is not
highly satisfactory, but it does seem most nearly in accord with the present
trend of mathematical thought. We shall refer to f as a "function" and to
f(x) as a "function value," or more fully as a "function value of x," or still
more fully as a "function value corresponding to x." If A is a term, such
as $x^2$, containing no free variables other than x, we shall refer to A also as a
"function value," or "function value of x," or "function value corresponding
to x" because in such case we can always define a function f to be
$\{\langle x,y\rangle \mid y = A\}$, and then we have $\vdash (x).\ A = f(x)$.

We  note  that  there  is  a  general  correspondence  between  functions  and 
function  values,  which  we  shall  illustrate  by  a  few  examples. 

Given two function values, f(x) and g(x), we can construct the function
value which is their sum, namely, f(x) + g(x).

Correspondingly, given two functions, f and g, we can construct the
function which is their sum, namely,

\[ f + g = \{\langle x,y\rangle \mid y = f(x) + g(x)\}. \]

Clearly the sum of the function values is related to the sum of the
functions by the relation

\[ \vdash (x).\ (f + g)(x) = f(x) + g(x), \]

which is usually taken as the definition of the sum of two functions in a
function space, or as the sum of two transformations.

There are two notions of product of functions. One corresponds to the
notion of product of function values, and one does not. Suppose we define

$f \cdot g$    as    $\{\langle x,y\rangle \mid y = f(x) \cdot g(x)\}$,

then this corresponds to the product of function values, since

\[ \vdash (x).\ (f \cdot g)(x) = f(x) \cdot g(x). \]

On the other hand, if we define

$f \times g$    as    $\{\langle x,y\rangle \mid y = g(f(x))\}$,

then

\[ \vdash (x).\ (f \times g)(x) = g(f(x)). \]

The latter is the type of product which is used when we speak of the
product of two transformations. We shall see that it is identical with the
notion $f|g$ which we introduced for the product of two relations.
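
The sum and the two products may be contrasted in a small computation. In the Python sketch below, offered only as an illustration, functions are represented by Python callables rather than by classes of ordered pairs, and the names f_sum, f_dot, and f_times are our own; f_times(f, g) corresponds to $f \times g$, that is, first f and then g.

```python
def f_sum(f, g):
    """f + g: the function whose value at x is f(x) + g(x)."""
    return lambda x: f(x) + g(x)


def f_dot(f, g):
    """The product corresponding to the product of function values."""
    return lambda x: f(x) * g(x)


def f_times(f, g):
    """The product of transformations: the value at x is g(f(x))."""
    return lambda x: g(f(x))


f = lambda x: x + 1
g = lambda x: 2 * x

print(f_sum(f, g)(3))      # f(3) + g(3)
print(f_dot(f, g)(3))      # f(3) * g(3)
print(f_times(f, g)(3))    # g(f(3))
print(f_times(g, f)(3))    # f(g(3)); the order of the factors matters
```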

The notion of derivative is another instance of the correspondence
between functions and function values. Given a function value, f(x), we
can construct the function value which is the derivative with respect to x
of the original function value, namely,

\[ \frac{d}{dx} f(x) = D_x(f(x)) = \lim_{h \to 0} \frac{f(x+h) - f(x)}{h}. \]

Correspondingly, given a function, f, we can construct the function which
is the derivative of the original function, namely,

\[ Df = f' = \Bigl\{\langle x,y\rangle \Bigm| y = \lim_{h \to 0} \frac{f(x+h) - f(x)}{h}\Bigr\}. \]

Then one has

\[ \vdash (x).\ (Df)(x) = f'(x) = \frac{d}{dx} f(x) = D_x(f(x)). \]

We note one point here, namely, that in speaking of the derivative of a
function value, one must specify what variable one is going to differentiate
with respect to, whereas in speaking of the derivative of a function, the
specification of the variable of differentiation is automatic.

Thus we cannot speak simply of the derivative of $x^2$, since we may
differentiate with respect to x, getting

\[ \frac{d}{dx} x^2 = D_x x^2 = 2x, \]

or we may differentiate with respect to 2x, getting

\[ \frac{d}{d(2x)} x^2 = D_{2x} x^2 = x, \]

or we may differentiate with respect to $x^2$, getting

\[ \frac{d}{d(x^2)} x^2 = D_{x^2} x^2 = 1. \]


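The point is that one may think of a given function value as being a
function value for each of many different functions. Thus, if we put

\[ f_1 = \{\langle x,y\rangle \mid y = x^2\}, \]
\[ f_2 = \{\langle x,y\rangle \mid y = \tfrac{1}{4}x^2\}, \]
\[ f_3 = \{\langle x,y\rangle \mid y = x\}, \]

then

\[ f_1(x) = x^2, \qquad f_2(2x) = x^2, \qquad f_3(x^2) = x^2. \]

Thus $x^2$ is a function value of each of $f_1$, $f_2$, and $f_3$, and indeed of many other
functions. The derivatives of the functions $f_1$, $f_2$, $f_3$ are given by

\[ Df_1 = f_1' = \{\langle x,y\rangle \mid y = 2x\}, \]
\[ Df_2 = f_2' = \{\langle x,y\rangle \mid y = \tfrac{1}{2}x\}, \]
\[ Df_3 = f_3' = \{\langle x,y\rangle \mid y = 1 \,.\, x = x\}. \]

So

\[ (Df_1)(x) = 2x, \qquad (Df_2)(x) = \tfrac{1}{2}x, \qquad (Df_3)(x) = 1. \]

These accord with the general relation

\[ (Df)(x) = D_x(f(x)). \]

However,

\[ (Df_2)(2x) = x \neq D_x(f_2(2x)) = 2x, \]
\[ (Df_3)(x^2) = 1 \neq D_x(f_3(x^2)) = 2x. \]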

The error of writing $(Df)(g(x))$ for $D_x(f(g(x)))$ is a common one with
beginning students in calculus. Thus, if f is "sin," then $Df = f' =$ "cos."
So

\[ (D \sin)(x) = D_x(\sin x) = \cos x, \]

but

\[ (D \sin)(x^2) = \cos x^2 \neq D_x(\sin x^2) = 2x \cos x^2. \]
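
The distinction can be observed numerically. The Python sketch below is an illustration only: it replaces the limit in the definition of Df by a central difference quotient with a small fixed h, which is an approximation we have swapped in and not the definition itself. The name D and the sample point are our own choices.

```python
import math


def D(f, h=1e-6):
    """Approximate Df by a central difference quotient (not the exact limit)."""
    return lambda x: (f(x + h) - f(x - h)) / (2 * h)


f1 = lambda x: x ** 2
f2 = lambda x: x ** 2 / 4
f3 = lambda x: x

x = 1.5

# The same function value x^2 arises from three different functions.
print(f1(x), f2(2 * x), f3(x ** 2))

# (Df2)(2x) is about x, whereas the derivative with respect to x of f2(2x) is about 2x.
print(D(f2)(2 * x), D(lambda t: f2(2 * t))(x))

# (D sin)(x^2) is about cos(x^2), whereas D_x(sin(x^2)) is about 2x cos(x^2).
print(D(math.sin)(x ** 2), D(lambda t: math.sin(t ** 2))(x))
```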
The process of finding $D_x(f(g(x)))$ is accomplished by means of the
so-called "chain rule." We may express this either in terms of function values
as

\[ \frac{dy}{dx} = \frac{dy}{du}\,\frac{du}{dx}, \]

or in terms of functions as

\[ D(g \times f) = (g \times Df) \cdot Dg, \]

or (as is sometimes done) partly in terms of function values and partly in
terms of functions as

\[ \frac{d}{dx} f(g(x)) = f'(g(x)) \cdot \frac{d}{dx} g(x). \]

The trouble with the form

\[ \frac{dy}{dx} = \frac{dy}{du}\,\frac{du}{dx} \tag{1} \]

is that it does not seem to convey to the beginning calculus student the
information that

\[ \frac{d}{dx} \sin x^2 = 2x \cos x^2. \tag{2} \]

For the average beginning student, (1) and (2) are separate and
unrelated formulas.
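
The chain rule in its function form, $D(g \times f) = (g \times Df) \cdot Dg$, can be tried on the example $\sin x^2$. The Python sketch below is an illustration only; it again approximates D by a central difference quotient (an assumption of the sketch), and the names D, f_times, and f_dot are our own, with f_times(g, f) standing for $g \times f$, the function whose value at x is f(g(x)).

```python
import math


def D(f, h=1e-6):
    """Approximate Df by a central difference quotient (not the exact limit)."""
    return lambda x: (f(x + h) - f(x - h)) / (2 * h)


def f_times(f, g):
    """The product of transformations: the value at x is g(f(x))."""
    return lambda x: g(f(x))


def f_dot(f, g):
    """The product corresponding to the product of function values."""
    return lambda x: f(x) * g(x)


g = lambda x: x ** 2      # the inner function
f = math.sin              # the outer function

lhs = D(f_times(g, f))                    # D(g x f)
rhs = f_dot(f_times(g, D(f)), D(g))       # (g x Df) . Dg

for x in (0.3, 1.0, 1.7):
    print(x, lhs(x), rhs(x), 2 * x * math.cos(x ** 2))
```

The three printed columns should agree to within the accuracy of the difference quotient, which is the content of formula (2) above.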

We  are  not  intending  to  suggest  that  the  other  forms  of  the  chain  rule 
would  be  more  useful  to  the  beginning  calculus  student,  but  merely  to 
indicate  the  difficulties  which  can  arise  because  of  the  fact  that  a  given 
function  value  can  be  a  function  value  for  each  of  many  different  functions. 
These  difficulties  are  greatly  increased  when  we  come  to  functions  of  several 
variables  and  try  to  deal  with  partial  derivatives  (we  shall  give  below  an 
illustration  of  a  problem  involving  partial  derivatives  which  is  particularly 
baffling to the average student). This situation is particularly aggravated
in  some  subject  such  as  thermodynamics,  where  it  is  usually  the  functions, 
rather  than  the  function  values,  which  are  significant,  but  in  which  it  is 
always  the  function  values  which  appear  in  the  formulas. 

Another  subject  in  which  there  is  a  delicate  interplay  between  the  use  of 
functions  and  function  values  is  the  subject  of  differential  equations.  This 
interplay  is  completely  concealed  by  the  notation,  which  is  traditionally 
in  terms  exclusively  of  function  values.  For  example,  the  average  student 
in  a  course  in  differential  equations  will  find  it  extremely  difficult  (if  not 
impossible)  to  prove  the  following  true  statement: 
"If y is a solution of $y'' - 2xy' + ny = 0$, and z is got from y by replacing
n by $-n - 2$ and x by $ix$, then

\[ z\,e^{x^2} \]

is also a solution."

In passing, we might say a word about differentials. The standard
definition is

\[ df(x) = f'(x)\,dx. \]

The expression on the right is a function value of three independent
quantities, namely, f, x, and dx. In the earliest tradition of calculus, before a
clear notion of limit, derivative, etc., evolved, it was considered that dy
is a function value of y only. It is quite clear that $f'(x)\,dx$ is not a function
value of y alone, but numerous attempts have been made to interpret it
so, and controversies have raged between proponents of differing
interpretations. The truth simply is that $f'(x)\,dx$ is a function value of three
independent quantities f, x, and dx, which behaves sufficiently like the
traditional symbol dy that one can preserve the traditional formulas of
calculus. Perhaps the time is overdue for a break with a few more of the
traditional formulas of calculus.

Clearly, an important fact about function values is that, unless the
independent variable is carefully specified, one cannot claim that the function
value determines a definite function. This fact is widely known but often
carelessly ignored. This may explain part of the reason for concentrating
on the function rather than the function value in modern mathematics.
This point is relevant in connection with such notions as continuity,
integrability, etc. When applied to functions, no specification of the
independent variable is necessary, but when applied to function values, the
independent variable must be carefully specified.

Failure  to  make  such  specification  leads  to  such  confusing  remarks  as  the 
classic  remark  which  appears  in  a  certain  text  on  the  theory  of  functions 
of  a  real  variable,  and  which  says  that,  although  every  continuous  function 
of  a  continuous  function  is  a  continuous  function,  it  is  not  true  that  every 
integrable  function  of  an  integrable  function  is  an  integrable  function. 

Upon looking at the proof given for this remarkable statement, one is
able to see that its author was using the word "function" in the sense of
"function value" and that the meaning of the statement is:

"If f(x) and g(x) are continuous function values of x with respect to x,
then g(f(x)) is a continuous function value of x with respect to x, but it is
not true that if f(x) and g(x) are integrable function values of x with
respect to x, then g(f(x)) is an integrable function value of x with respect
to x."

Using the fact that $g(f(x)) = (f|g)(x)$, the above statement can be put
much more simply in terms of functions as follows:

"If f and g are continuous functions, then $f|g$ is a continuous function,
but it is not true that if f and g are integrable functions, then $f|g$ is an
integrable function."

In most calculus texts f(x) is called a "function" rather than a "function
value." As we said earlier, this is perfectly acceptable provided that one
does not use the word "function" to refer to f. This leaves the writer of
the calculus text with no terminology for the "function" f. One could
quite properly call f a transformation, but there is a prejudice against this,
particularly at the calculus level. As a result, the writer of the calculus
text will usually prefer to express formulas in terms of function values
rather than in terms of functions. A notable exception is the formula for
Taylor's series, which is usually written by means of the function notation
as

\[ f(a + h) = f(a) + \frac{h}{1!} f'(a) + \frac{h^2}{2!} f''(a) + \cdots, \]


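instead of being written in the more cumbersome function-value notation as

\[ \bigl[f(x)\bigr]_{x=a+h} = \bigl[f(x)\bigr]_{x=a} + \frac{h}{1!}\Bigl[\frac{d}{dx} f(x)\Bigr]_{x=a} + \frac{h^2}{2!}\Bigl[\frac{d^2}{dx^2} f(x)\Bigr]_{x=a} + \cdots. \]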


This usually confuses the students, since the earlier explanation of f'(x)
as $\frac{d}{dx} f(x)$ is likely to lead the student to think of $\frac{d}{dx} f(a)$ rather than
$\bigl[\frac{d}{dx} f(x)\bigr]_{x=a}$ as the interpretation of f'(a). This is why there is nearly
always some student in a class (usually one of the better students) who
inquires why f'(a), f''(a), ... are not all zero.
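
The student's difficulty can be made concrete. In the Python sketch below, which is an illustration only, f'(a) is computed as (Df)(a), that is, by differentiating the function and then taking the value at a; differentiating the constant f(a) instead gives zero. The derivative is approximated by a central difference quotient (an assumption of the sketch), and the choice of exp, of a, and of the step are arbitrary.

```python
import math


def D(f, h=1e-4):
    """Approximate Df by a central difference quotient (not the exact limit)."""
    return lambda x: (f(x + h) - f(x - h)) / (2 * h)


f = math.exp
a, step = 0.5, 0.1

fprime_at_a = D(f)(a)            # f'(a) read as (Df)(a)

constant = lambda x: f(a)        # the constant function with value f(a)
d_constant = D(constant)(a)      # the derivative of a constant is zero

# Taylor's series through the second-order term, in function notation.
taylor2 = f(a) + step * D(f)(a) + step ** 2 / 2 * D(D(f))(a)

print(fprime_at_a, d_constant)
print(taylor2, f(a + step))
```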

In elementary calculus, this preoccupation with function values, to the
almost total exclusion of functions, is not particularly disadvantageous,
except in connection with partial derivatives. There the need for accurate
specification of what the arguments of the function values are and what
the variable of differentiation is, and what the variable is which is "held
constant," make it much more suitable to deal with functions rather than
function values, except for the lack of any suitable terminology for doing
so at the calculus level. As a result of the lack of care in the usual
function-value presentation of partial derivatives, the average student is completely
unprepared to deal with such problems as the transformation from
coordinates (x,y) to coordinates $(r,\theta)$, with subsequent manipulation of such
quantities as
-—        ($\theta$ constant),
oy 

etc. 

We  are  not  suggesting  that  one  should  go  to  the  other  extreme  and  try  to 
dispense  with  function  values  in  favor  of  functions  at  all  points.  Rather, 
we  favor  having  a  terminology  for  both  functions  and  function  values,  so 
that  one  can  use  whichever  is  most  suitable  to  the  occasion  at  hand. 

Such terminology is gradually coming into use. In the meantime, there
is the question how one should teach functions in a calculus course. There
is one calculus book on the market (Randolph and Kac) which tries
consistently to refer to f rather than f(x) as a function. However, they lack a
name for f(x), and this involves them in many difficulties. All other
calculus texts that we know of refer to f(x) as a function and have no name
for f. In teaching from such a text, it will probably be best to go along
with the text in calling f(x) a function. One can explain to the students
that f stands for the rule which determines for any value of x the value to
be assigned to f(x). One could consistently refer to f as the "function
rule" of the "function" f(x). One should of course emphasize most strongly
the necessity for specifying the argument variable when referring to a
"function" or to differentiation, continuity, etc., of a "function." One
should probably not expect any but the best students to comprehend the
distinction between the derivative of a "function rule" f and the derivative
with respect to x of a "function" f(x). A careful treatment of such
subtleties, like a careful treatment of limits and real numbers, must probably be
postponed to the course in analysis, at which time one can change
terminology and begin referring to f as the function, using some suitable (and
different) terminology for f(x).

Such a program is very unsound, in that it requires the student to learn
two contradictory uses of the word "function." It is to be hoped that a
suitable calculus text which refers to f as a function and has a suitable
terminology for f(x) will soon be available, or else that the present trend
in higher mathematics toward calling f a function will be reversed. As
we have repeatedly said, one could call f(x) a function and f a
transformation. However, the present trend is not in this direction.

In higher mathematics beyond the calculus, it becomes much more
necessary to have facility in the use of both functions and function values.
However, even in higher mathematics, confusion of functions with function
values is the rule at the present time. Even writers who are very careful
about other matters allow the bad habits acquired in some early calculus
course to prevail when speaking of functions. The most common practice
is to perpetuate the calculus custom of calling f(x) a function and having
no terminology for f. When one wishes to deal with function spaces, such
as the space L^2, such a treatment of functions is very awkward.
Alternatively, other writers set out to use "function" to mean f and then
have no terminology for f(x). For instance, we could cite one excellent and
(in the main) carefully written text which introduces functions with the
statement:

"The  notion  of  function  is  identical  with  the  notion  of  transformation 
of  one  space  with  another." 

This is an unequivocal statement that the author proposes to use the
word "function" in our sense, as denoting f rather than f(x). However,
his old habits are too strong for him, and only eight pages later, we find
him using and defining the statement "the function f(x) tends to b as x
tends to a."

It would be very awkward to make this statement in terms of functions
rather than function values, and we approve of using function values rather
than functions in the statement. However, after the author has reserved
the word "function" to denote f, he should not use it to denote f(x). If the
statement were written "the function value f(x) tends to b as x tends to a,"
it would be quite unobjectionable.

Here, as usually, the source of the difficulty is in failing to provide
separate terminologies for f and f(x).

If one has a function f and an argument x, there is the standard notation
for the corresponding function value, to wit, f(x). The problem is, given a
function value, x², to write a formula for the corresponding function. We
have been using {⟨x,y⟩ | y = x²}, which is adequate, but space-consuming.
Following a suggestion of Church, we define

λx(A)        for         {⟨x,y⟩ | y = A.x = x}

where y is a variable not appearing in A.    Then

⊢ (λx(A))(x) = A.

The additional "x = x" which we have inserted on the right side plays no
role except to ensure that both x and y have free occurrences on the right
(even when A contains no free occurrences of x) so that our conventions
as to the meaning of {⟨x,y⟩ | y = A.x = x} will make it denote the class of
all ordered pairs ⟨x,y⟩ such that y = A. Thus, we have the notation
λx(x²) for the function determined by the function value x² considered as a
function value of the independent variable x, and

⊢ (x).(λx(x²))(x) = x²,

and in general

⊢ (λx(x²))(B) = B²

where B is any term.
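
The distinction between a function and a function value has a rough
counterpart in programming languages. The following Python sketch is only
an informal analogy: a Python lambda is a callable object, not the class of
ordered pairs defined above, and the names square, B, and const5 are
illustrative assumptions.

```python
# Informal analogy: `lambda x: x ** 2` plays the role of λx(x²),
# while `x ** 2` (for a given x) is merely a function value.
square = lambda x: x ** 2      # the function λx(x²)

x = 3
value = x ** 2                 # a function value; it depends on x
applied = square(x)            # (λx(x²))(x) = x²

# (λx(x²))(B) = B² for any term B; here B is taken to be x + 1.
B = x + 1
assert square(B) == B ** 2

# If A has no free occurrences of x, λx(A) is a constant function.
const5 = lambda x: 5
assert const5(0) == const5(100) == 5
```

Here square can be passed around and named without mentioning any
argument, whereas value is just a number computed from the current x.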
The decision to reserve the word "function" to denote f and use some
other term, such as "function value" to denote f(x) will require various
changes in our familiar treatment of mathematical entities if one is to be
consistent. Thus it is standard procedure to consider a sequence as a
function of positive integers. That is, we identify a_n with a(n). Then a_n
would be a function value rather than a function. The sequence, being the
corresponding function, would be denoted by λn(a_n). It is significant that
some writers feel the distinction between the sequence, λn(a_n), and an
arbitrary term of the sequence, a_n, sufficiently to write {a_n} to denote the
sequence, and to distinguish carefully between {a_n} and a_n.

Although inconsistency in the use of the word "function" is common,
inconsistency in the use of the symbols f and f(x) is rare. An exception
is in the treatment of group characters. A group character is a function
whose argument range is the set of members of the group. Nevertheless,
it is current practice to denote a group character by some such symbol as
χ(s), where s denotes a variable which takes the members of the group as
values. Thus, χ(s), which is a symbol for a function value, is used to denote
a function. This is quite confusing. Clearly χ(s) depends on s, since by
giving different values to s, we get different values for χ(s). However, the
group character depends only on the group and the particular
representation of the group which is involved; in fact, we have the theorem
that (under appropriate hypotheses as to the nature of the group), "two
representations are identical when and only when their characters are the
same." Thus a notation such as χ(s), which certainly depends upon s, is
clearly unsuitable for referring to a group character. The obvious solution
is to speak of χ as the group character and use χ(s) only when one wishes to
speak of the value of χ for the argument s (this happens occasionally).

Another case in which one uses f(x) where f should be used is in Laplace
transforms.    A standard notation is to write F(s) = ℒ(f(t)) to denote

F(s) = ∫_0^∞ e^{-st} f(t) dt.

Clearly a more suitable notation would be F = ℒ(f), since it is really the
functions rather than the function values which are transformed by the
Laplace operator. This becomes clear if one tries to ascribe any connection
between the variables s and t in F(s) = ℒ(f(t)). In any genuine relation
between function values, such as (d/dx)x² = 2x, the x's on one side of the
equation must have some connection with the x's on the other side. There
can be no such connection between the s and t in F(s) = ℒ(f(t)), and so
this might better be written F = ℒ(f).

If we differentiate the formula which this represents, we get

F'(s) = ∫_0^∞ e^{-st} (-t f(t)) dt.

This is commonly written

F'(s) = ℒ(-t f(t)).

In fact, books on Laplace transforms commonly state as a theorem:

If F(s) = ℒ(f(t)), then F'(s) = ℒ(-t f(t)).

Without a notation for the function associated with a function value, it
would be difficult to write this theorem properly. However, using the
notation which we suggested, we can write it as:

If F = ℒ(f), then DF = ℒ(λt(-t f(t))).

Indeed, one can write it more simply still as:

Dℒ(f) = ℒ(λt(-t f(t))).
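
The function-level form of this theorem can be checked numerically. The
Python sketch below is an illustration under stated assumptions, not a
general Laplace transform routine: laplace truncates the integral at a
finite upper limit and uses the trapezoid rule, D uses a central
difference, and the test function f(t) = e^{-t} is chosen only because its
transform is known in closed form. All names are illustrative.

```python
import math

def laplace(f, upper=60.0, n=60000):
    """Return the function F = L(f), approximated by the trapezoid rule
    on [0, upper]. Truncating at `upper` assumes f(t) e^{-st} is
    negligible beyond it."""
    h = upper / n
    def F(s):
        total = 0.5 * (f(0.0) + math.exp(-s * upper) * f(upper))
        for k in range(1, n):
            t = k * h
            total += math.exp(-s * t) * f(t)
        return total * h
    return F

def D(F, h=1e-4):
    """Return the function DF, approximated by a central difference."""
    return lambda s: (F(s + h) - F(s - h)) / (2 * h)

f = lambda t: math.exp(-t)

lhs = D(laplace(f))                     # the function D L(f)
rhs = laplace(lambda t: -t * f(t))      # the function L(λt(-t f(t)))
```

Both lhs and rhs are functions of s; no variable t appears in the statement
that they agree, which is the point made in the text. For this f the exact
common value at s = 1 is -1/4.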

A few careful writers write {F(s)} = ℒ{f(t)}, the { } indicating that
one is dealing with the function rather than the function value. In fact,
one may conjecture that for such writers the following definition is tacitly
in effect:

If A is a term containing free occurrences of one and only one variable,
x, then {A} shall denote λx(A).

We could not use such a notation ourselves, since it would collide with
another interpretation for {A} which we have already adopted. Also, it
is much less flexible than the notation λx(A), which permits that A not
contain free occurrences of x at all (so that one can get a constant function)
or that A contain additional free variables (so that the function λx(A) can
depend on various parameters). Also there is apparently a reluctance to
use {F(x)} interchangeably with F, {F'(x)} interchangeably with DF, etc.
This latter is a minor point but results in a considerable loss of flexibility.
There is also the difficulty that in {F(x)} it must be clear that x is a variable
and F is not.

When one deals with functions as well as function values, it is very easily
seen that D is a function whose function values are derivatives of functions.
Thus we can define

D = {⟨f,g⟩ | g = f'} = λf(f'),

and then have

⊢ (f).D(f) = Df = f'.

Because the arguments and values of D are functions, D is often referred
to as an "operator" rather than as a "function." This is probably due in
part to the confusion in the use of the word "function." Similarly, the ℒ
of the Laplace transform is called an operator rather than a function. In
our sense of the word "function," an operator is a perfectly good function
but rather specialized.

In case f and x are both variables, then f(x) contains free variables other
than x, namely, f, and is not strictly a function value of x, any more than
x + y is. Rather f(x) is a function value of the two argument variables f
and x, just as x + y is a function value of the two argument variables x
and y. However, in the combination f(x), it is usual to think of f as
standing for some particular function, and not a variable, and x as standing
for (or being) a variable. In such case, it is proper to refer to f(x) as a
function value of x. This is analogous to x + a, where a is thought of as a
constant but x as a variable. Then x + a is a function value of the argument
variable x; however, as long as a remains unspecified, we do not know just
what function x + a is a value of, any more than we know what function f(x)
is a function value of as long as f remains unspecified.

We can write λx(x + y), and then we have a function depending on the
parameter y. In such a case, mathematical tradition would require that
we use λx(x + a) instead, but clearly the difference is one of notation only.
The function λx(x + y) is the function of "adding y."

The fact that we allow in the symbolic logic terms with no meaning leads
to minor discrepancies between our terminology and that current in
mathematics.   This comes about as follows.

By Thm.VII.2.2,

⊢ (f,x)(E₁z).z = f(x).

This would seem to say that f(x) is defined and unique for every f and
x.    Such is not the case at all.

We recall that f(x) is ιy (xfy). In Chapter VIII, we pointed out that,
for any F(y), we could form the combination ιy F(y) and prove various
insignificant properties of ιy F(y) by means of Axiom schemes 8, 9, and 10.
However, to prove any really significant properties of ιy F(y), we must
have (E₁y) F(y), so that we can use Axiom scheme 11 to infer F(ιy F(y)).

So it is with f(x). In case we have (E₁y) xfy, then we can use Axiom
scheme 11 to infer that xf(f(x)). In such case f(x) exists in the usual
mathematical sense of the existence of a function value. Alternatively one
says that f(x) is defined at x.

The fact that Axiom schemes 8, 9, and 10 give f(x) a sort of spurious
existence for any x or f is a trifle confusing. What it means is that
"existence of f(x)" in the mathematical sense is not actually a statement
about f(x), despite its misleading grammatical form, but rather a statement
about f and x separately. Thus the statement "f(x) is defined at x"
would be rendered symbolically as (E₁y) xfy, and in this form does not
contain the combination f(x) at all.

Thus if we are dealing with real variables and put

f = {⟨x,y⟩ | y > 0.y² = x},

then

⊢ (x)(E₁z).z = f(x).

However, only for x > 0 do we have (E₁y) xfy, and so only for x > 0 do
we have xf(f(x)).    That is, we have

⊢ (x):x > 0. ⊃ .f(x) > 0.(f(x))² = x,

but not

⊢ (x).f(x) > 0.(f(x))² = x,

although we do have

⊢ (x)(E₁z).z = f(x)

as well as

⊢ (x):x > 0. ⊃ .(E₁z).z = f(x).

So "f(x) is defined" only for x > 0. For this f, the values of x for which
"f(x) is defined" are just Arg(f), which in the case of a function is called
the range.
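
The difference between the term f(x) always denoting something and f(x)
being "defined" can be imitated on a finite scale. The Python sketch below
is a loose model under several assumptions: a small range of integers
stands in for the real variables, relations are finite sets of pairs, and
the description operator is modelled by a hypothetical helper iota that
returns None whenever there is not exactly one candidate.

```python
def iota(candidates, default=None):
    """Model of ιy F(y): the unique candidate if exactly one exists,
    otherwise a fixed junk value (`default`), mirroring the 'spurious
    existence' of the term when uniqueness fails."""
    candidates = list(candidates)
    return candidates[0] if len(candidates) == 1 else default

DOMAIN = range(-4, 17)          # small finite stand-in for the reals

# f = {<x,y> | y > 0 . y² = x}
f = {(x, y) for x in DOMAIN for y in DOMAIN if y > 0 and y * y == x}

def value_at(rel, x):
    """rel(x), read as ιy (x rel y)."""
    return iota(y for (a, y) in rel if a == x)

def arg(rel):
    """Arg(rel): the x for which x rel y holds for some y."""
    return {x for (x, _) in rel}
```

With these definitions value_at(f, x) returns something for every x, but
only for x in arg(f) does the returned value actually stand in the relation
f to x.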

If we wish to restrict the range of "existence" of f(x) to some class α, we
have only to replace f by α↿f.

There is one unfortunate aspect of the notation y = R(x), and this is
that it reverses the accustomed order of x and R in xRy. Thus for a
function R we have (for x in Arg(R), where "R(x) is defined")

y = R(x). ≡ .xRy,

in which x, R, and y run in reverse order on the two sides of the equivalence.
One could have avoided this, as Whitehead and Russell did, by always
writing xRy in the reverse order yRx.    However, this would require

⟨x,y⟩ ∈ R. ≡ .yRx,

and the reversal of order here seems even worse. As a matter of fact,
mathematicians habitually derive points ⟨x,y⟩ from equations y = f(x),
and are accustomed to the reversal, and so we have followed the current
mathematical practice. If R and S are functions, then the reversal appears
again in the fact that (R|S)(x) = S(R(x)). This same reversal appears in
mathematics, in that if one has transformations f and g and forms the
product transformation fg (which is just f|g), then fg transforms the point x
into g(f(x)).

We turn now to formal properties of functions.
**Theorem X.5.1.     ⊢ (R)::R ∈ Funct.: ≡ :.R ∈ Rel:(x,y,z):xRy.xRz. ⊃ .
y = z.

Corollary 1.     ⊢ Funct ⊆ Rel.

Corollary 2.     ⊢ (R):.R ∈ Funct: ≡ :(x,y,z):xRy.xRz. ⊃ .y = z.

Corollary 3.     ⊢ (R):.R̆ ∈ Funct: ≡ :(x,y,z):xRz.yRz. ⊃ .x = y.

Theorem X.5.2.     ⊢ (R):.R ∈ Funct: ≡ :(x):x ∈ Arg(R). ≡ .(E₁y).xRy.

Proof.     Assume R ∈ Funct.    Then (y,z):xRy.xRz. ⊃ .y = z.    So

(Ey).xRy:: ≡ ::(Ey) xRy:.(y,z):xRy.xRz. ⊃ .y = z.

So by Thm.X.4.1, Part I, and Thm.VII.2.1,

x ∈ Arg(R). ≡ .(E₁y).xRy.

Conversely, assume the latter and let xRy.xRz. Then x ∈ Arg(R). So
(E₁y) xRy. That is, (Ew)(y):w = y. ≡ .xRy. So by rule C, (y):w = y.
≡ .xRy.    Hence w = y and w = z, and so y = z.

Corollary.     ⊢ (R):.R̆ ∈ Funct: ≡ :(y):y ∈ Val(R). ≡ .(E₁x).xRy.

This theorem says that for a function f the range of arguments is just the
set of values of x for which "f(x) is defined."
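
For finite relations, Theorems X.5.1 and X.5.2 can be tested directly. The
Python sketch below assumes relations are represented as finite sets of
pairs; the helper names and the sample relations square and root are
illustrative, and the universe U is an arbitrary finite range chosen for
the check.

```python
def is_funct(R):
    """R ∈ Funct for a finite set of pairs: xRy and xRz imply y = z."""
    seen = {}
    for x, y in R:
        if seen.setdefault(x, y) != y:
            return False
    return True

def arg(R):
    return {x for x, _ in R}

def val(R):
    return {y for _, y in R}

def has_unique_image(R, x):
    """(E₁y).xRy"""
    return len({y for a, y in R if a == x}) == 1

def satisfies_x52(R, universe):
    """x ∈ Arg(R) ≡ (E₁y).xRy, checked for every x in `universe`."""
    return all((x in arg(R)) == has_unique_image(R, x) for x in universe)

square = {(x, x * x) for x in range(-3, 4)}   # a function
root = {(y, x) for x, y in square}            # its converse; not a function
U = range(-3, 10)
```

The relation root fails both tests at once: 9 stands in the relation to
both 3 and -3, so it lies in Arg(root) without having a unique image.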
**Theorem X.5.3.     ⊢ (R,x):.R ∈ Funct.x ∈ Arg(R): ⊃ :(y):y = R(x). ≡ .xRy.

Proof.     Use Thm.X.5.1, Cor. 1, Thm.X.5.2, and Axiom scheme 11.

Corollary 1.     ⊢ (R,x):R ∈ Funct.x ∈ Arg(R). ⊃ .xR(R(x)).

Corollary 2.     ⊢ (R,y):.R̆ ∈ Funct.y ∈ Val(R): ⊃ :(x):x = R̆(y). ≡ .xRy.

Corollary 3.     ⊢ (R,y):R̆ ∈ Funct.y ∈ Val(R). ⊃ .(R̆(y))Ry.
*Corollary 4.     ⊢ (R,x):R ∈ Funct.x ∈ Arg(R). ⊃ .R(x) ∈ Val(R).
*Corollary 5.     ⊢ (R,y):R̆ ∈ Funct.y ∈ Val(R). ⊃ .R̆(y) ∈ Arg(R).

Theorem X.5.4.     ⊢ (R,y).y ∈ Val(R). ⊃ .y(R̆|R)y.

Proof. Assume y ∈ Val(R). Then (Ex).xRy. So (Ex).yR̆x.xRy. That
is, y(R̆|R)y.

Corollary.     ⊢ (R).Arg(R̆|R) = Val(R̆|R) = Val(R).

Theorem X.5.5.     ⊢ (R,y,z):R ∈ Funct.y(R̆|R)z. ⊃ .y = z.

Proof. Assume R ∈ Funct and y(R̆|R)z. Then by rule C, yR̆x.xRz. So
xRy and xRz.    So y = z.

Corollary 1.     ⊢ (R):.R ∈ Funct: ⊃ :(y,z):y(R̆|R)z. ≡ .y = z.y ∈ Val(R).

Corollary 2.     ⊢ (R):R ∈ Funct. ⊃ .R̆|R ∈ Funct.

Corollary 3.     ⊢ (R,y):R ∈ Funct.y ∈ Val(R). ⊃ .y = (R̆|R)(y).

This tells us that sin (arcsin x) = x regardless of the determination of
arcsin x. Also, if f is a continuous function and we define ∫ f(x) dx as an
antiderivative, then

(d/dx) ∫ f(x) dx = f(x)

regardless of the constant of integration used in ∫ f(x) dx.
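
Corollary 1 of Thm.X.5.5 can be observed on a small many-one function. In
the Python sketch below, relations are assumed to be finite sets of pairs,
compose(R, S) is an illustrative helper for the relative product R|S, and
the squaring relation is chosen because its converse is many-valued, much
as arcsin is.

```python
def converse(R):
    """The converse of R: all pairs of R reversed."""
    return {(y, x) for x, y in R}

def compose(R, S):
    """R|S: x(R|S)z holds when xRy and ySz for some y."""
    return {(x, z) for x, y in R for y2, z in S if y == y2}

def val(R):
    return {y for _, y in R}

square = {(x, x * x) for x in range(-3, 4)}      # a many-one function

# converse(square)|square: first pick any "square root", then square it.
round_trip = compose(converse(square), square)

# square|converse(square): square first, then pick any "square root".
other_way = compose(square, converse(square))
```

Whatever determination of the converse is used, round_trip sends each
value of square back to itself, whereas other_way relates 1 to -1 and so is
not the identity.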


Theorem X.5.6.     ⊢ (R,α,β):R ∈ Funct.β ⊆ R"α. ⊃ .β = R"((R̆"β) ∩ α).

Proof. Assume R ∈ Funct and β ⊆ R"α. Let y ∈ β. Then y ∈ R"α.
So by Thm.X.4.22 and rule C, xRy.x ∈ α. So xRy.x ∈ α.xRy.y ∈ β. So by
Thm.X.4.23, corollary, xRy.x ∈ α.x ∈ R̆"β. So xRy.x ∈ (R̆"β) ∩ α. So
y ∈ R"((R̆"β) ∩ α). Conversely, assume y ∈ R"((R̆"β) ∩ α). Then by
rule C, xRy.x ∈ (R̆"β) ∩ α. So by rule C, xRy.x ∈ α.xRz.z ∈ β. So y = z.
z ∈ β.    So y ∈ β.

Corollary. ⊢ (R)::R ∈ Funct.: ⊃ :.(α,β):β ⊆ R"α: ≡ :(Eγ).γ ⊆ α.β =
R"γ.

Proof. To go from left to right, take γ to be (R̆"β) ∩ α. To go from
right to left, use Thm.X.4.25.

From the point of view of transformation theory, R"α is the map of the
class α by the transformation R. Our corollary states that, if β is included
in the map of α, then β is the map of some γ which is included in α.
Moreover, the theorem names such a γ, to wit, (R̆"β) ∩ α. In general, such
a γ need not be unique, though if R is univalent (meaning that R,R̆ ∈ Funct),
then the γ will be unique.
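
A finite instance makes the choice of γ and its possible non-uniqueness
concrete. The Python sketch below assumes relations are finite sets of
pairs; image is an illustrative helper for R"α, and the particular R, alpha,
and beta are chosen only for the demonstration.

```python
def image(R, a):
    """R"a: the map of the class a by the relation R."""
    return {y for x, y in R if x in a}

def converse(R):
    return {(y, x) for x, y in R}

R = {(x, x * x) for x in range(-3, 4)}     # squaring; not univalent
alpha = {-3, -2, 0, 1, 2}
beta = {0, 4}                              # included in image(R, alpha)

# The gamma named by the theorem: (converse(R)"beta) ∩ alpha
gamma = image(converse(R), beta) & alpha
```

Here gamma is {-2, 0, 2}, but the smaller class {0, 2} is also included in
alpha and has the same map, which is possible because squaring is not
univalent.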

Theorem X.5.7.     ⊢ (R,S):R ∈ Funct.S ⊆ R. ⊃ .S ∈ Funct.

Proof.     Assume R ∈ Funct and S ⊆ R.    Then R ∈ Rel and so S ∈ Rel.
Now let xSy.xSz.    Then, by S ⊆ R, xRy.xRz.    So by R ∈ Funct, y = z.
*Corollary.     ⊢ (R,α,β):R ∈ Funct. ⊃ .α↿R, R↾β, α↿R↾β ∈ Funct.

Theorem X.5.8.     ⊢ (R,S):R ∈ Funct.S ⊆ R. ⊃ .S = Arg(S)↿R.

Proof. Assume R ∈ Funct and S ⊆ R. Let xSy. Then x ∈ Arg(S) and
xRy. So x(Arg(S)↿R)y. Conversely, let x(Arg(S)↿R)y. Then by rule C,
xSz and xRy.    So xRz.    Hence y = z.    Hence xSy.

This tells us that a relation S is included in a function R if and only if S
is a function got by restricting the range of definition of the function R.
**Theorem X.5.9.     ⊢ (R,S):R,S ∈ Funct. ⊃ .R|S ∈ Funct.

Proof. Assume R,S ∈ Funct. By Thm.X.4.17, Part II, R|S ∈ Rel. Now
let x(R|S)y and x(R|S)z. By rule C, xRu.uSy and xRv.vSz. So, since
R ∈ Funct, u = v.    So uSy and uSz.    So, since S ∈ Funct, y = z.

This tells us that the product of two transformations is a transformation,
since R|S is just the product of the transformations R and S. With
Thm.X.4.18 telling us that the multiplication of transformations is
associative, and with x̂ŷ(x = y) serving as the identity transformation, we
see that the set of all transformations of a set into itself forms a semigroup.

We define the identity transformation, I, taking

I         for         x̂ŷ(x = y).

I is stratified and may be assigned any desired type, since it has no free
variables.

Theorem X.5.10.

I.  ⊢ (x,y):xIy. ≡ .x = y.
II.  ⊢ I ∈ Funct.
III.  ⊢ Cnv(I) = I.
IV.  ⊢ Arg(I) = Val(I) = V.
V.  ⊢ (x).I(x) = x.
**Theorem X.5.11.     ⊢ (R).R = R|I = I|R.

Proof of ⊢ R = R|I.

⊢ xRz. ≡ .(Ey).xRy.y = z
. ≡ .(Ey).xRy.yIz
. ≡ .x(R|I)z.

The proof of ⊢ R = I|R is similar.

This theorem tells us that multiplication by the identity transformation
leaves any transformation unchanged.

Theorem X.5.12.     ⊢ (α).α↿I = I↾α = α↿I↾α.
Proof.

⊢ x ∈ α.x = y. ≡ .x = y.y ∈ α.

≡ .x ∈ α.x = y.y ∈ α.

Corollary.  ⊢ (R):R ∈ Funct. ⊃ .R̆|R = Val(R)↿I = I↾Val(R) =
Val(R)↿I↾Val(R).

Use Thm.X.5.5, Cor. 1.

Theorem X.5.13.  ⊢ (R)::R ∈ Funct.: ⊃ :.(S,x,y):x(R|S)y. ≡ .x ∈ Arg(R).
(R(x))Sy.

Proof. Assume R ∈ Funct. Now let x(R|S)y. Then by rule C, xRz and
zSy. So x ∈ Arg(R). Then by Thm.X.5.3, z = R(x). So (R(x))Sy.
Conversely, let x ∈ Arg(R) and (R(x))Sy. Then by Thm.X.5.3, Cor. 1,
xR(R(x)).    So x(R|S)y.

Corollary.  ⊢ (R,x):.R ∈ Funct.x ∈ Arg(R): ⊃ :(S,y):x(R|S)y. ≡ .(R(x))Sy.

Theorem X.5.14.  ⊢ (R,x):.R ∈ Funct.x ∈ Arg(R): ⊃ :(S).(R|S)(x) =
S(R(x)).

Proof. Assume R ∈ Funct and x ∈ Arg(R). So by Thm.X.5.13, corollary,
(y).x(R|S)y ≡ (R(x))Sy. So by Axiom scheme 9, ιy (x(R|S)y) =
ιy ((R(x))Sy).    That is, (R|S)(x) = S(R(x)).

This theorem merely verifies what we have said several times already,
that the product fg of two transformations f and g is a transformation h
such that

h(x) = g(f(x)).

The fact that we do not require that S be a transformation is a convenient
peculiarity due to our method of treating ι. For fg to be a transformation,
it will usually be necessary for both f and g to be transformations.
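
The order convention in the product is easy to get backwards, so a small
check may help. The Python sketch below models transformations as ordinary
callables; compose is an illustrative helper for the relative product in
the sense used here (first factor applied first), which is the reverse of
the usual "g after f" composition symbol. The particular f and g are
arbitrary.

```python
def compose(R, S):
    """R|S for transformations given as callables: apply R first, then S,
    so that (R|S)(x) = S(R(x))."""
    return lambda x: S(R(x))

f = lambda x: x + 1      # "add 1"
g = lambda x: 2 * x      # "double"

fg = compose(f, g)       # f|g: add 1, then double
gf = compose(g, f)       # g|f: double, then add 1
```

For the argument 3 the product fg gives g(f(3)) = 8, while gf gives
f(g(3)) = 7, so the two products differ.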

Theorem X.5.15.     ⊢ (α,R,x):R ∈ Funct.x ∈ α ∩ Arg(R). ⊃ .R(x) ∈ R"α.

Proof.     Assume R ∈ Funct and x ∈ α ∩ Arg(R).    Then by Thm.X.5.3,
Cor. 1, xR(R(x)).x ∈ α.    So by Thm.X.4.22, R(x) ∈ R"α.
**Theorem X.5.16.    ⊢ (R,S)::R,S ∈ Funct.Arg(R) = Arg(S):.(x):x ∈ Arg(R).
⊃ .R(x) = S(x).: ⊃ :.R = S.

Proof. Assume the hypothesis, and let xRy. Then x ∈ Arg(R). Then
R(x) = S(x) by hypothesis, and y = R(x) by Thm.X.5.3, so that y = S(x).
Now from x ∈ Arg(R), we get x ∈ Arg(S) by hypothesis, thence xS(S(x)) by
Thm.X.5.3, Cor. 1. Combining with y = S(x) gives xSy. As R,S ∈ Funct,
we have R,S ∈ Rel, and so R ⊆ S.    One proves S ⊆ R similarly.

In words, this theorem states that, if two functions are defined for the
same set of arguments, and always take equal values, then the functions
are the same. This provides the most commonly used means of proving
two functions equal.

Theorem X.5.17.    ⊢ (R):R ∈ Funct. ≡ .RUSC(R) ∈ Funct.

Proof. Assume R ∈ Funct and let x(RUSC(R))y and x(RUSC(R))z. By
rule C, uRv.x = {u}.y = {v} and u'Rv'.x = {u'}.z = {v'}. Then {u} =
{u'}. So u' = u. So uRv'. So v = v'. So y = z. Conversely let
RUSC(R) ∈ Funct, and let xRy and xRz. Then {x}(RUSC(R)){y} and
{x}(RUSC(R)){z}.    So {y} = {z}.    So y = z.

Theorem X.5.18.  ⊢ (R,x):R ∈ Funct.x ∈ Arg(R). ⊃ .(RUSC(R))({x}) =
{R(x)}.

Proof. Assume R ∈ Funct and x ∈ Arg(R). Then {x} ∈ USC(Arg(R)).
So by Thm.X.4.29, Part I, {x} ∈ Arg(RUSC(R)). Also, by Thm.X.5.17,
RUSC(R) ∈ Funct.    So by Thm.X.5.3

(1)  (y):{x}(RUSC(R))y. ≡ .y = (RUSC(R))({x}).

By the hypothesis and Thm.X.5.3, xR(R(x)).    So {x}(RUSC(R)){R(x)}.
So by (1), {R(x)} = (RUSC(R))({x}).

Of considerable importance are univalent transformations or 1-1
relations, that is, relations R such that R,R̆ ∈ Funct.    So we define

1-1        for        R̂(R,R̆ ∈ Funct).

Clearly 1-1 is stratified, and may be assigned any type, since it contains no
free variables.

Theorem X.5.19.     ⊢ (R):R ∈ 1-1. ≡ .R,R̆ ∈ Funct.

Corollary 1.    ⊢ 1-1 ⊆ Funct.

Corollary 2.    ⊢ (R):R ∈ Funct. ⊃ .R̆|R ∈ 1-1.

Proof. Use Thm.X.5.5, Cor. 2, to get R̆|R ∈ Funct. But ⊢ Cnv(R̆|R) =
R̆|Cnv(R̆) = R̆|R, so that also Cnv(R̆|R) ∈ Funct.

Corollary 3.     ⊢ (R):R ∈ 1-1. ⊃ .R|R̆, R̆|R ∈ 1-1.

Corollary 4.     ⊢ (R,S):R ∈ 1-1.S ⊆ R. ⊃ .S ∈ 1-1.

Corollary 5.     ⊢ (α,β,R):R ∈ 1-1. ⊃ .α↿R, R↾β, α↿R↾β ∈ 1-1.

Corollary 6.     ⊢ (R,S):R,S ∈ 1-1. ⊃ .R|S ∈ 1-1.

Corollary 7.     ⊢ (R):R ∈ 1-1. ≡ .R̆ ∈ 1-1.

Corollary 8.     ⊢ I ∈ 1-1.

Corollary 9.     ⊢ (R):R ∈ 1-1. ≡ .RUSC(R) ∈ 1-1.

Theorem X.5.20.

I.  ⊢ (R,x):R ∈ 1-1.x ∈ Arg(R). ⊃ .R̆(R(x)) = x.
*II.  ⊢ (R,y):R ∈ 1-1.y ∈ Val(R). ⊃ .R(R̆(y)) = y.

Proof of Part I.     Assume R ∈ 1-1 and x ∈ Arg(R).    Then by Thm.X.5.14,

R̆(R(x)) = (R|R̆)(x).

However R̆ ∈ Funct, so that by Thm.X.5.5, Cor. 3, x ∈ Val(R̆). ⊃ .
x = (R|R̆)(x).    As Val(R̆) = Arg(R), we get x = (R|R̆)(x).

Proof of Part II.     Similar.

Theorem X.5.21.  ⊢ (R,S):R,S ∈ Funct.Arg(R) ∩ Arg(S) = Λ. ⊃ .
R ∪ S ∈ Funct.

Proof. Assume the hypothesis. Let x(R ∪ S)y.x(R ∪ S)z. Then
xRy.xRz.∨.xRy.xSz.∨.xSy.xRz.∨.xSy.xSz.

Case 1.     xRy.xRz.    Then y = z.

Case 2.  xRy.xSz. Then x ∈ Arg(R).x ∈ Arg(S). This contradicts
Arg(R) ∩ Arg(S) = Λ, and so by reductio ad absurdum, y = z.

Case 3.     Like Case 2.

Case 4.     xSy.xSz.    Then y = z.

So in any case y = z.
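
The role of the disjointness hypothesis shows up clearly on finite
examples. In the Python sketch below, relations are assumed to be finite
sets of pairs, and the relations R, S, and T are illustrative. Note that in
Python the expression R | S on sets denotes the union of R and S, not the
relative product written R|S in the text.

```python
def is_funct(R):
    """For a finite set of pairs: each argument occurs in exactly one pair."""
    return len({x for x, _ in R}) == len(set(R))

def arg(R):
    return {x for x, _ in R}

R = {(x, -x) for x in range(-3, 0)}    # defined on the negative integers
S = {(x, x) for x in range(0, 4)}      # defined on the non-negative integers
T = {(0, 7), (1, 1)}                   # its argument range overlaps that of S

piecewise = R | S                      # set union of R and S
clash = S | T                          # set union of S and T
```

Here piecewise is the absolute value function on the integers from -3
to 3. The union clash fails to be a function because 0 receives both 0
and 7, even though S and T agree at the argument 1.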

Corollary 1.  ⊢ (R,S):R̆,S̆ ∈ Funct.Val(R) ∩ Val(S) = Λ. ⊃ .Cnv(R ∪ S)
∈ Funct.

Corollary 2.  ⊢ (R,S):R,S ∈ 1-1.Arg(R) ∩ Arg(S) = Λ.Val(R) ∩ Val(S) =
Λ. ⊃ .R ∪ S ∈ 1-1.

Theorem X.5.22.     ⊢ (α,R):R ∈ Funct. ⊃ .(R̆|R)"α = α ∩ Val(R).

Proof.     Assume R ∈ Funct.    Then by Thm.X.5.5, Cor. 1,

y ∈ (R̆|R)"α. ≡ .(Ex).x(R̆|R)y.x ∈ α

. ≡ .(Ex).x = y.x ∈ Val(R).x ∈ α
. ≡ .y ∈ Val(R).y ∈ α.

Corollary 1.     ⊢ (α,R):R ∈ Funct. ⊃ .R"(R̆"α) = α ∩ Val(R).

Corollary 2.  ⊢ (α,β,R):R ∈ Funct.R̆"α = R̆"β. ⊃ .α ∩ Val(R) =
β ∩ Val(R).

Corollary 3.  ⊢ (α,β,R):R ∈ 1-1.R"α = R"β. ⊃ .α ∩ Arg(R) = β ∩
Arg(R).

Suppose A is a term containing no free variables other than x. Then A
is a function value of x. The corresponding function is x̂ŷ(y = A), where
y is a variable not occurring in A. Because of the importance of this
function, we introduce a shorter notation for it, namely,

λx(A)        for        x̂ŷ(y = A)

where y is a variable not occurring in A. Other than our specification on y,
we make no specification about occurrences of variables in A. Thus,
though the common usage of λx(A) is for A which contains free occurrences
of x and of x only, we are permitted to use the notation for A with no free
variables, so as to get a constant function, or for A with additional free
variables, in which case we get a function involving parameters, namely,
the free variables other than x which have occurrences in A. For
stratification, it is necessary and sufficient that x = A be stratified, and if
λx(A) is stratified, it has type one higher than x (or A). All occurrences of
x are bound in λx(A), but any other free occurrences of variables in A are
free occurrences in λx(A) and constitute all the free occurrences in λx(A).

Clearly λx(x²) is the function "square of," λx(4x) is the function "four
times," etc.    Consequently,

D(λx(x²)) = λx(2x),

D(λx(e^x)) = λx(e^x),
etc.

Theorem X.5.23.     If x = A is stratified, then:
*I.  ⊢ (x,y):x(λx(A))y. ≡ .y = A.
*II.  ⊢ λx(A) ∈ Funct.
III.  ⊢ (x).x(λx(A))A.
*IV.  ⊢ Arg(λx(A)) = V.
V.  ⊢ (x).(λx(A))(x) = A.
VI.  ⊢ (w).(λx(A))(w) = {Sub in A: w for x}, provided that the
substitution indicated by {Sub in A: w for x} causes no confusion.
Proof. Part I follows by Thm.X.3.7. Then Part II follows by Thm.
X.5.1. From Part I, we get Part III by replacing y by A. Then by Part
III, we get ⊢ (x).x ∈ Arg(λx(A)), so that Part IV follows. Now Part V
follows by Thm.X.5.3 from Parts II, IV, and III. If we replace x by w
in Part V, then we get Part VI.

Notice that, in this theorem, it is permitted that A contain other free
variables besides x.    Thus by Part VI we would have

⊢ (λx(x + y))(w) = w + y,
⊢ (λx(x + y))(y) = y + y,
etc.

However, λx(x + y) is not a function. It is a function value of y. If
one gives y a particular value, such as 3, then one gets a function, namely,
λx(x + 3), the function "plus 3."    Nevertheless,

⊢ (λx(x + y))(x) = x + y,

so that λx(x + y) has the behavior characteristic of a function (of x). The
classical mathematical phrase for this situation is to say that λx(x + y) is a
function dependent on the parameter y.
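
A function dependent on a parameter corresponds closely to what
programmers call a closure. The Python sketch below is an informal
analogy only; the name plus is illustrative, and Python functions are not
classes of ordered pairs.

```python
def plus(y):
    """A function value of y: for each y it returns the function λx(x + y)."""
    return lambda x: x + y

plus3 = plus(3)          # giving y the value 3 yields the function "plus 3"

# (λx(x + y))(w) = w + y, here with y = 3 and w = 4
seven = plus3(4)

# (λx(x + y))(y) = y + y, here with y = 2
four = plus(2)(2)
```

The expression plus(y) is not itself one particular function until y is
fixed, which matches the remark that λx(x + y) is a function value of y.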
Theorem X.5.24.  If the variable x occurs free in A and the variable α
does not occur in A, and x = A is stratified, then

⊢ (α).(λx(A))"α = {A | x ∈ α}.

Proof.

⊢ y ∈ (λx(A))"α: ≡ :(Ex).x(λx(A))y.x ∈ α:

≡ :(Ex).y = A.x ∈ α:

≡ :y ∈ {A | x ∈ α}.

Theorem X.5.25.     ⊢ (R):R ∈ Funct. ⊃ .Arg(R)↿(λx(R(x))) = R.

Proof. Assume R ∈ Funct. Let x(Arg(R)↿(λx(R(x))))y. Then
x ∈ Arg(R).x(λx(R(x)))y. So xR(R(x)) and y = R(x). So xRy. Conversely,
assume xRy. Then x ∈ Arg(R). So y = R(x). So x(λx(R(x)))y. So
x(Arg(R)↿(λx(R(x))))y.

Theorem X.5.26.  ⊢ (α,β,R):.β = Clos(α,xRz).R ∈ Funct: ⊃ :(x):
x ∈ β ∩ Arg(R). ⊃ .R(x) ∈ β.

Proof. Assume β = Clos(α,xRz), R ∈ Funct, x ∈ β, x ∈ Arg(R). Then
x ∈ β.xR(R(x)).    So R(x) ∈ R"β.    So by Thm.X.4.35, R(x) ∈ β.

Theorem X.5.27.  If x = A is stratified, and z has no free occurrences in
A, then ⊢ (α,β):.β = Clos(α,x(λx(A))z): ⊃ :(x):x ∈ β. ⊃ .A ∈ β.

A function of two variables has to be a class of ordered triples. We recall
that the ordered triple ⟨x,y,z⟩ is ⟨⟨x,y⟩,z⟩. Then, since ⟨x,y⟩Rz means
⟨⟨x,y⟩,z⟩ ∈ R, we have

⊢ ⟨x,y,z⟩ ∈ R .≡. ⟨x,y⟩Rz.

We say that R is a function of the two variables x and y if

(x,y,u,v): ⟨x,y⟩Ru. ⟨x,y⟩Rv .⊃. u = v.

We write

R(x,y)     for     ιz(⟨x,y⟩Rz).

This makes R(x,y) identical with R(⟨x,y⟩). Also, given a term A, we
write

λxy(A)     for     {⟨x,y,z⟩ | z = A. x = x. y = y}.

Then we have:

Theorem X.5.28. If x = y = A is stratified, then:

*I. ⊢ (x,y,z): ⟨x,y⟩(λxy(A))z .≡. z = A.

II. ⊢ (x,y,u,v): ⟨x,y⟩(λxy(A))u. ⟨x,y⟩(λxy(A))v .⊃. u = v.

III. ⊢ λxy(A) ∈ Funct.

IV. ⊢ (x,y). ⟨x,y⟩(λxy(A))A.

V. ⊢ Arg(λxy(A)) = V × V.

VI. ⊢ (x,y). (λxy(A))(x,y) = A.

VII. ⊢ (u,v). (λxy(A))(u,v) = {Sub in A: u for x, v for y}, provided that
the substitutions indicated by {Sub in A: u for x, v for y} cause
no confusion.
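
As an informal illustration of Parts I, III, V, and VI, one may model λxy(A) over a small finite universe in Python. This is only a sketch under stated assumptions: the universe is a finite range rather than V, and the names lam_xy, is_funct, and value_at are hypothetical helpers introduced here, not notation of the text.

```python
def lam_xy(term, universe):
    """Finite stand-in for λxy(A): every triple ((x, y), z) with z = A."""
    return {((x, y), term(x, y)) for x in universe for y in universe}


def is_funct(rel):
    """R ∈ Funct: no argument is paired with two different values."""
    seen = {}
    for arg, val in rel:
        if seen.setdefault(arg, val) != val:
            return False
    return True


def value_at(rel, x, y):
    """R(x, y): the unique z with <x, y> R z, mirroring the ι-description."""
    values = {val for arg, val in rel if arg == (x, y)}
    if len(values) != 1:
        raise ValueError("no unique value at this argument")
    return next(iter(values))


universe = range(4)  # hypothetical finite universe standing in for V
plus = lam_xy(lambda x, y: x + y, universe)
```

Here plus plays the role of λxy(x + y): it is a class of triples ⟨⟨x,y⟩,z⟩, it is single-valued, and value_at(plus, x, y) recovers x + y on the chosen universe.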
We can proceed quite analogously to functions of n variables.

Let us now consider the question of how we handle relations and functions
when dealing with variables of restricted range. If x,y,z are restricted
to Σ, then x,y,z would be restricted to Σ × Σ, and the corresponding R,S,T
would be restricted to SC(Σ × Σ). Thus Rel would be interpreted as
SC(Σ × Σ). These reinterpretations would not affect those theorems of
the present chapter which are stratified. Theorems dealing with USC(α)
would not hold in general.

Of rather more interest is the case where x is restricted to Σ₁ and y to Σ₂.
Then the corresponding relation x̂ŷP would be restricted to Σ₁ × Σ₂, and
its converse to Σ₂ × Σ₁. It does not seem profitable to try to record
precisely the changes that must be made in the theorems of this chapter to
accord with such restrictions on x and y, since these are usually applied
only in special cases, in which one can just as easily replace relations R by
Σ₁↿R↾Σ₂ and their converses by Σ₂↿R̆↾Σ₁. In other words, complicated
cases can best be handled by going to relations of the form α↿R↾β. The
most common situation of this sort is that in which we are considering
transformations from Σ₁ to Σ₂ and vice versa. It appears that the device
of using restricted variables in such case is of little value, since Σ₁ and Σ₂ do
not stay fixed long enough, and the best procedure is to use unrestricted
variables and state the necessary restrictions explicitly. Detailed
treatments of two special cases of this will occur in Sec. 1 of Chapter XI and
Sec. 1 of Chapter XII.

EXERCISES 

A relation R is called a transformation of α into β if R ∈ Funct, Arg(R) =
α, and Val(R) ⊆ β.

A relation R is called a transformation of α onto β if R ∈ Funct, Arg(R) =
α, and Val(R) = β.

A subset β of α is said to be invariant under a transformation R if
R"β = β.

A set of objects γ is called a semigroup with respect to an operator ∘ if:

1. (x,y): x,y ∈ γ .⊃. x ∘ y ∈ γ.

2. (x,y,z): x,y,z ∈ γ .⊃. x ∘ (y ∘ z) = (x ∘ y) ∘ z.

3. (Ee):. e ∈ γ:(x): x ∈ γ .⊃. e ∘ x = x ∘ e = x.

We say that γ is a group if in addition:

4. (e):: e ∈ γ:(x): x ∈ γ .⊃. e ∘ x = x ∘ e = x.: ⊃ :.(y):. y ∈ γ .⊃. (Ez). z ∈ γ.
y ∘ z = z ∘ y = e.
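
The four conditions above can be checked mechanically on a finite example. The following Python sketch assumes a three-element set α, represents each transformation as a frozenset of pairs, and takes the relative product R|S to mean "xRy and ySz for some y"; all function names are hypothetical. A finite check of this kind is a sanity test only and does not replace the proofs asked for in the exercises.

```python
from itertools import product


def transformations(alpha):
    """All functions from alpha into alpha, each as a frozenset of pairs."""
    elems = sorted(alpha)
    return {frozenset(zip(elems, images))
            for images in product(elems, repeat=len(elems))}


def rel_product(r, s):
    """Relative product R|S: x(R|S)z iff xRy and ySz for some y (assumed reading)."""
    return frozenset((x, z) for x, y in r for y2, z in s if y == y2)


def identities(gamma, op):
    """All two-sided identity elements of gamma under op."""
    return [e for e in gamma if all(op(e, x) == x == op(x, e) for x in gamma)]


def is_semigroup(gamma, op):
    """Conditions 1-3: closure, associativity, existence of an identity."""
    closed = all(op(x, y) in gamma for x in gamma for y in gamma)
    assoc = all(op(x, op(y, z)) == op(op(x, y), z)
                for x in gamma for y in gamma for z in gamma)
    return closed and assoc and bool(identities(gamma, op))


def is_group(gamma, op):
    """Condition 4 in addition: each y has a two-sided inverse for each identity."""
    inverses = all(any(op(y, z) == e == op(z, y) for z in gamma)
                   for e in identities(gamma, op) for y in gamma)
    return is_semigroup(gamma, op) and inverses


alpha = {0, 1, 2}
gamma = transformations(alpha)                             # α into α
onto = {t for t in gamma if {v for _, v in t} == alpha}    # α onto α
```

On this example the transformations of α into α satisfy conditions 1 to 3 but not 4 (a constant map has no inverse), while the transformations of α onto α satisfy all four.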

X.5.1. If α is a given set, and γ is the set of all transformations of α
into α, prove that γ is a semigroup with respect to the operator |. (Hint.
Take e to be α↿I.)

X.5.2. If α is a given set, and β is a given subset of α, and γ is the set
of all transformations of α into α under which β is invariant, prove that γ
is a semigroup with respect to the operator |.

X.5.3. If α is a given set, and β is a given subset of α, and γ is the set
of all transformations of α onto α under which β is invariant, prove that γ
is a semigroup with respect to the operator |. Discuss the cases β = Λ and
β = α.

X.5.4. If γ is a semigroup with respect to the operator ∘, prove that

(E₁e): e ∈ γ:(x): x ∈ γ .⊃. e ∘ x = x ∘ e = x.

X.5.5. If α is a given set, and β is a given subset of α, and γ is the set
of all relations R such that both R and R̆ are transformations of α onto α
and β is invariant under both R and R̆, then prove that γ is a group with
respect to the operator |. (Hint. Take R̆ to be the inverse of R.)

X.5.6. Prove that, if x = A is stratified, then ⊢ (α,β): (λx(A))"α ⊆ β.
≡ .(x). x ∈ α ⊃ A ∈ β.

X.5.7. Prove ⊢ I = λx(x).

X.5.8. Prove ⊢ (R,S): R,S ∈ Funct .⊃. R ∩ S ∈ Funct.

X.5.9. Bohr (see Bohr, 1947, page 39) defines the mean value of a
"function" f(x) as

lim_{T→∞} (1/T) ∫_0^T f(x) dx

and denotes this by M{f(x)}. Explain why M(f) would be a better notation.
On page 48, Bohr writes a(k) for M{f(x)e^{-ikx}}. Explain why a(f,k)
would be a better notation than a(k) and show how to write the definition
of a(f,k) in terms of M( ) if one is writing M(f) for the mean value of f.

X.5.10. Explain why one cannot prove ⊢ (f). (λf(f(0)))(f) = f(0).

X.5.11. Explain why one cannot prove ⊢ (x,y): ((λx(λy(x + y)))(x))(y)
= x + y.

X.5.12. Prove:

(a) ⊢ Λ ∈ Funct.

(b) ⊢ Λ ∈ 1-1.

X.5.13. If R^ is defined as in Ex.X.4.10, prove:

(a) ⊢ (R). R|R ⊆ R^.

(b) ⊢ (R). (R|R) ∪ (R|R^) = R^.

X.5.14. If < is defined as in Ex.X.1.4, and R^ is defined as in Ex.X.4.10,
prove ⊢ Nn↿(x̂ŷ(x < y))↾Nn = (Nn↿(λx(x + 1))↾Nn)^.

X.5.15. Prove ⊢ (α). I"α = α.

X.5.16. Prove ⊢ (β,x). (β × {x}) ∈ Funct.

X.5.17. Prove ⊢ (R,S,T):: R,S ∈ 1-1: T = S|R|Cnv(RUSC(S)):.(x,y):
xRy .⊃. y = {x}.: ⊃ :. T ∈ 1-1:(x,y): xTy .⊃. y = {x}.
6. Ordered Sets. Let us have a set α. It is ordered if there is an ordering
relation R such that, for various distinct members x and y of α, xRy
holds but yRx does not hold. Then, by means of the relation R, we could
decide for the x and y in question to put them in the order x,y rather than
the order y,x because xRy holds but yRx does not hold.

It is not generally required that for each x and y at least one of xRy or
yRx must hold. This property is usually present only for special ordering
relations which induce what is called a simple ordering of the set α.

For a set of real numbers, ≤, <, ≥, etc., would be ordering relations.
For a set of integers, the relation of "being a factor of" would be an ordering
relation. For a set of classes, the relation ⊆ would be an ordering relation.

Whitehead and Russell consider < between real numbers as the typical
ordering relation. The current mathematical practice is to take ≤ between
real numbers as the typical ordering relation. Thus if R is an ordering
relation for the class α, we should have

(x): x ∈ α .⊃. xRx.

That is, α↿I ⊆ R should hold. If it does not, we can easily arrange for it
to hold by replacing R by R ∪ (α↿I). Such a replacement does not alter
any ordering of the members of α which was imposed by R. For example,
if R is < between real numbers and α is the class of real numbers, then
α↿I ⊆ R fails to hold. However, we can replace R by R ∪ (α↿I) and have
the same ordering, since R ∪ (α↿I) is merely ≤ between real numbers.
A somewhat similar situation holds with regard to the property

(x,y): xRy .⊃. x,y ∈ α.

This need not hold in general. For instance, R might be the relation of ≤
between real numbers and α might be the class of rational numbers. Then
√2 ≤ π, but ∼ √2 ∈ α and ∼ π ∈ α. However, we can readily make the
property

(x,y): xRy .⊃. x,y ∈ α

hold by replacing R by α↿R↾α (see Thm.X.4.5, Part III). As far as members
of α are concerned, R and α↿R↾α would induce the same ordering. For
instance, if R is ≤ between real numbers and α is the class of rational
numbers, then α↿R↾α is just ≤ between rational numbers.

Thus it is feasible as well as convenient to make the convention that, if R
is to be considered an ordering relation for α, then

(x): x ∈ α .⊃. xRx

(x,y): xRy .⊃. x,y ∈ α
should both hold.
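
The two adjustments just described, adjoining α↿I and restricting to α on both sides, can be tried out on a finite example. The Python sketch below assumes a relation is a set of pairs and uses a small range of integers in place of the real numbers; restrict, identity_on, and av are hypothetical helper names.

```python
def restrict(alpha, r, beta):
    """α↿R↾β: the pairs of R with first member in alpha and second in beta."""
    return {(x, y) for x, y in r if x in alpha and y in beta}


def identity_on(alpha):
    """α↿I: the identity relation confined to alpha."""
    return {(x, x) for x in alpha}


def av(r):
    """AV(R): everything occurring as an argument or a value of R."""
    return {x for x, _ in r} | {y for _, y in r}


universe = range(6)                      # stand-in for the real numbers
less = {(x, y) for x in universe for y in universe if x < y}
alpha = {1, 3, 5}                        # the set to be ordered

# First adjoin α↿I, then cut down to α on both sides.
normalized = restrict(alpha, less | identity_on(alpha), alpha)
```

After both steps the relation satisfies the two conventions of the text on this example: xRx holds for every x in α, and xRy implies that x and y are in α.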
If R is an ordering relation for α, then from the two conditions just given,
we have

α = Arg(R) = Val(R) = AV(R).

Thus an ordering relation R for α will determine α uniquely. Hence,
instead of dealing with an ordered set, which consists of a set α with an
associated ordering relation R, we can deal exclusively with the ordering
relation R, since R determines α by the relation α = AV(R). That is, all
statements about ordered sets are equivalent to statements about their
ordering relations, and vice versa.

Thus we do not introduce the notion of an ordered set at all, but only
the notion of an ordering relation. The set ordered by an ordering relation
R is just AV(R).

If one wished to introduce the notion of an ordered set, one would have to
devise a formalism for the notion of a set α with an associated ordering
relation R. One could use the ordered pair ⟨α,R⟩ for this purpose and write

Ord Set     for     α̂R̂((x): x ∈ α .⊃. xRx:.(x,y): xRy .⊃. x,y ∈ α).

Then we would have

⊢ ⟨α,R⟩ ∈ Ord Set.: ≡ :.(x): x ∈ α .⊃. xRx:.(x,y): xRy .⊃. x,y ∈ α,

so that Ord Set would be the class of all ordered sets. However, it will be
more economical to deal merely with the ordering relations.

We define a reflexive relation R as one such that xRx for all x in AV(R).
Thus we define

Ref     for     R̂(x): x ∈ AV(R) .⊃. xRx.

Ref is stratified and has no free occurrences of a variable and so may be
assigned any type.
Theorem X.6.1.

*I. ⊢ (R):. R ∈ Ref: ≡ : R ∈ Rel:(x): x ∈ AV(R) .⊃. xRx.
II. ⊢ Ref ⊆ Rel.

III. ⊢ (R):. R ∈ Ref: ≡ :(x): x ∈ AV(R) .⊃. xRx.

IV. ⊢ (R): R ∈ Ref .⊃. Arg(R) = Val(R) = AV(R).
V. ⊢ (β,R): R ∈ Ref .⊃. β↿R↾β ∈ Ref.

VI. ⊢ (R): R ∈ Ref .⊃. R̆ ∈ Ref.

VII. ⊢ (β,R): R ∈ Ref .⊃. Arg(β↿R↾β) = Val(β↿R↾β) = AV(β↿R↾β) =
β ∩ AV(R).
We define a transitive relation R as one such that xRy. yRz .⊃. xRz.
Thus we define

Trans     for     R̂(x,y,z): xRy. yRz .⊃. xRz.
Trans is stratified and has no free occurrences of a variable and so may
be assigned any type.
Theorem X.6.2.
*I. ⊢ (R):. R ∈ Trans: ≡ : R ∈ Rel:(x,y,z): xRy. yRz .⊃. xRz.

II. ⊢ Trans ⊆ Rel.

III. ⊢ (R): R ∈ Trans .≡. R|R ⊆ R.

IV. ⊢ (α,β,R): R ∈ Trans .⊃. α↿R↾β ∈ Trans.
V. ⊢ (R): R ∈ Trans .⊃. R̆ ∈ Trans.

An  ordered  set  is  said  to  be  quasi-ordered  if  its  ordering  relation  is 
reflexive  and  transitive.  Accordingly  we  shall  say  that  a  relation  is  a  quasi- 
ordering  relation  if  it  is  reflexive  and  transitive.    So  we  define 

Qord     for     Ref ∩ Trans.

Theorem X.6.3.

I. ⊢ (R): R ∈ Qord .≡. R ∈ Ref. R ∈ Trans.
II. ⊢ Qord ⊆ Rel.

III. ⊢ (β,R): R ∈ Qord .⊃. β↿R↾β ∈ Qord.

IV. ⊢ (R): R ∈ Qord .⊃. R̆ ∈ Qord.

As an example of a quasi-ordering relation, consider the relation R which
holds between two complex numbers z and w when and only when the real
part of z is less than or equal to the real part of w.

Suppose R is an ordering relation of α and β is any subset of α. Then
β↿R↾β is an ordering relation of β which imposes the same ordering on the
elements of β which is imposed by R. This is expressed in words as:

"Any  subset  of  an  ordered  set  is  itself  an  ordered  set  relative  to  the  same 
ordering  relation." 

Part III of Thm.X.6.3 gives a special case of this, namely, that if α is a
quasi-ordered set, then any subset of α is a quasi-ordered set relative to the
same ordering relation.

Part  IV  of  Thm.X.6.3  says  that  a  quasi-ordered  set  is  still  quasi-ordered 
if  one  reverses  the  order. 

We define an antisymmetric relation R as one such that xRy. yRx .⊃.
x = y. Thus we define

Antisym     for     R̂(x,y): xRy. yRx .⊃. x = y.

Antisym is stratified and has no free occurrences of a variable and so
may be assigned any type.
Theorem X.6.4.
I. ⊢ (R):. R ∈ Antisym: ≡ : R ∈ Rel:(x,y): xRy. yRx .⊃. x = y.
II. ⊢ Antisym ⊆ Rel.

III. ⊢ (α,β,R): R ∈ Antisym .⊃. α↿R↾β ∈ Antisym.
IV. ⊢ (R): R ∈ Antisym .⊃. R̆ ∈ Antisym.

*V. ⊢ (R,x,y): R ∈ Antisym. x(R − I)y .⊃. ∼(yRx).

VI. ⊢ (R,x,y,z): R ∈ Antisym. xRy. y(R − I)z .⊃. x ≠ z.
VII. ⊢ (R,x,y,z): R ∈ Antisym. x(R − I)y. yRz .⊃. x ≠ z.

Proof of Part V. Assume R ∈ Antisym. Then by Part I, xRy. yRx .⊃.
x = y. So xRy. x ≠ y .⊃. ∼(yRx).

Proof of Part VI. Assume R ∈ Antisym. Then by Part I, xRy. yRz.
x = z .⊃. y = z. So xRy. yRz. y ≠ z .⊃. x ≠ z.

When R is a relation like ≤, then R − I corresponds to <. Then Part V
above is just the familiar statement

x < y .⊃. ∼(y ≤ x).

An ordered set is said to be partially ordered if its ordering relation is
reflexive, transitive, and antisymmetric. Accordingly, we shall say that a
relation is a partial ordering relation if it is reflexive, transitive, and
antisymmetric. So we define

Pord = Ref ∩ Trans ∩ Antisym.

Theorem X.6.5.

I. ⊢ (R): R ∈ Pord: ≡ : R ∈ Ref. R ∈ Trans. R ∈ Antisym.
II. ⊢ Pord = Qord ∩ Antisym.

III. ⊢ Pord ⊆ Rel.

IV. ⊢ (β,R): R ∈ Pord .⊃. β↿R↾β ∈ Pord.

V. ⊢ (R): R ∈ Pord .⊃. R̆ ∈ Pord.

VI. ⊢ (R,x,y,z): R ∈ Pord. xRy. y(R − I)z .⊃. x(R − I)z.
VII. ⊢ (R,x,y,z): R ∈ Pord. x(R − I)y. yRz .⊃. x(R − I)z.
VIII. ⊢ (R,x,y,z): R ∈ Pord. x(R − I)y. y(R − I)z .⊃. x(R − I)z.

We may interpret Part IV as saying that any subset of a partially ordered
set is partially ordered relative to the same ordering relation.

We may interpret Part V as saying that any partially ordered set is still
partially ordered if we reverse the order.

If R is ≤, then we may interpret Parts VI, VII, and VIII as saying

x ≤ y. y < z .⊃. x < z
x < y. y ≤ z .⊃. x < z

x < y. y < z .⊃. x < z

respectively.

Examples of partial ordering relations are the relation of "being a factor
of" among positive integers (we recall that any positive integer is a factor
of itself), or the relation ⊆ among subsets of any given set Σ, or the relation
R which holds between any two complex numbers z and w when and only
when the real part of z is less than or equal to the real part of w and the
imaginary parts of z and w are equal.
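
The properties Ref, Trans, and Antisym are easy to test on finite fragments of these examples. The Python sketch below assumes relations are finite sets of pairs, with the positive integers cut off at 12 and the complex numbers replaced by a 3 by 3 grid; the predicate names are hypothetical and merely mirror the definitions of this section.

```python
def av(r):
    """AV(R): everything occurring as an argument or a value of R."""
    return {x for x, _ in r} | {y for _, y in r}


def is_ref(r):
    return all((x, x) in r for x in av(r))


def is_trans(r):
    return all((x, z) in r for x, y in r for y2, z in r if y == y2)


def is_antisym(r):
    return all(x == y for x, y in r if (y, x) in r)


def is_qord(r):
    return is_ref(r) and is_trans(r)


def is_pord(r):
    return is_qord(r) and is_antisym(r)


# "Being a factor of" among the positive integers up to 12.
divides = {(m, n) for m in range(1, 13) for n in range(1, 13) if n % m == 0}

# The quasi-ordering by real part, on a small grid of complex numbers.
points = [complex(a, b) for a in range(3) for b in range(3)]
by_real = {(z, w) for z in points for w in points if z.real <= w.real}

# The same comparison with the extra demand that imaginary parts agree.
by_real_same_imag = {(z, w) for z, w in by_real if z.imag == w.imag}
```

On these fragments divides and by_real_same_imag come out as partial orderings, while by_real is a quasi-ordering that fails to be antisymmetric, since distinct numbers with equal real parts are related in both directions.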

A "least" member of β relative to R (if there is one) is an x such that
x ∈ β ∩ AV(R), and (y): y ∈ β ∩ AV(R) .⊃. xRy. A "minimal" member
of β relative to R (if there is one) is an x such that x ∈ β ∩ AV(R), and
(y): y ∈ β ∩ AV(R) .⊃. ∼(y(R − I)x).

Accordingly we define

x least_R β     for     x ∈ β ∩ AV(R):(y): y ∈ β ∩ AV(R) .⊃. xRy,

x min_R β     for     x ∈ β ∩ AV(R):(y): y ∈ β ∩ AV(R) .⊃. ∼(y(R − I)x),

x greatest_R β     for     x least_Cnv(R) β,

x max_R β     for     x min_Cnv(R) β,

where y is a variable not occurring at all in x, R, or β.

For stratification of x least_R β, x min_R β, x greatest_R β, and x max_R β, the
types of R and β should be one higher than the type of x. The free
occurrences of variables are those in x, R, or β.

Theorem X.6.6.
I. ⊢ (β,R,x): R ∈ Antisym. x least_R β .⊃. x min_R β.
II. ⊢ (β,R,x): R ∈ Antisym. x greatest_R β .⊃. x max_R β.

Theorem X.6.7.
I. ⊢ (β,R,x,y): R ∈ Antisym. x least_R β. y least_R β .⊃. x = y.
II. ⊢ (β,R,x,y): R ∈ Antisym. x greatest_R β. y greatest_R β .⊃. x = y.

Thus,  for  an  antisymmetric  relation  a  "least"  member  of  a  set  is  unique, 
so  that  one  should  speak  of  "the  least"  member  of  a  set.  On  the  other 
hand,  a  minimal  member  may  well  fail  to  be  unique,  so  that  one  must 
speak  of  "a  minimal"  member  rather  than  "the  minimal"  member. 
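
The difference between "least" and "minimal" shows up already for the relation of "being a factor of". The Python sketch below assumes a finite set of pairs for R (integers up to 12) and hypothetical helper names; it follows the two definitions given above term by term.

```python
def av(r):
    """AV(R): everything occurring as an argument or a value of R."""
    return {x for x, _ in r} | {y for _, y in r}


def is_least(x, r, beta):
    """x least_R β: x is in β ∩ AV(R) and xRy for every y in β ∩ AV(R)."""
    dom = beta & av(r)
    return x in dom and all((x, y) in r for y in dom)


def is_minimal(x, r, beta):
    """x min_R β: x is in β ∩ AV(R) and no y in β ∩ AV(R) has y(R − I)x."""
    dom = beta & av(r)
    return x in dom and not any((y, x) in r and y != x for y in dom)


divides = {(m, n) for m in range(1, 13) for n in range(1, 13) if n % m == 0}
beta = {2, 3, 4, 6}

minimal_members = [x for x in sorted(beta) if is_minimal(x, divides, beta)]
least_members = [x for x in sorted(beta) if is_least(x, divides, beta)]
```

For β = {2, 3, 4, 6} there are two minimal members, 2 and 3, and no least member, whereas for {2, 4, 6} the number 2 is the least member.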

Theorem X.6.8. ⊢ (β,γ,R,S,x):. R ∈ Ref. S = γ↿R↾γ: ⊃ : x least_R (β ∩ γ).
≡ . x least_S β.

Proof. Assume R ∈ Ref and S = γ↿R↾γ. Then by Thm.X.6.1, Part VII,
AV(S) = γ ∩ AV(R). So

(1) β ∩ AV(S) = (β ∩ γ) ∩ AV(R).

Now

⊢ x ∈ (β ∩ γ) ∩ AV(R):(y): y ∈ (β ∩ γ) ∩ AV(R) .⊃. xRy:.

≡ :. x ∈ (β ∩ γ) ∩ AV(R):(y): y ∈ (β ∩ γ) ∩ AV(R) .⊃. x ∈ γ. xRy. y ∈ γ:.
≡ :. x ∈ (β ∩ γ) ∩ AV(R):(y): y ∈ (β ∩ γ) ∩ AV(R) .⊃. xSy.

So by (1),

x least_R (β ∩ γ) .≡. x least_S β.

We define a connected relation R as one such that if x,y ∈ AV(R), then
either xRy or yRx. So we define

Connex     for     R̂(x,y): x,y ∈ AV(R) .⊃. xRy ∨ yRx.

Connex is stratified and has no free occurrences of a variable and so may
be assigned any type.
Theorem X.6.9.

I. ⊢ (R):. R ∈ Connex: ≡ : R ∈ Rel:(x,y): x,y ∈ AV(R) .⊃. xRy ∨ yRx.
II. ⊢ Connex ⊆ Rel.

III. ⊢ (β,R): R ∈ Connex .⊃. β↿R↾β ∈ Connex.

IV. ⊢ (R): R ∈ Connex .⊃. R̆ ∈ Connex.
V. ⊢ Connex ⊆ Ref.

Proof of Part V. Suppose R ∈ Connex. Then let x ∈ AV(R). Then by
Part I, xRx ∨ xRx.

Note that many familiar ordering relations such as ⊆ are not connected.
Theorem X.6.10.

I. ⊢ (β,R,x):. R ∈ Antisym ∩ Connex: ⊃ : x least_R β .≡. x min_R β.

II. ⊢ (β,R,x):. R ∈ Antisym ∩ Connex: ⊃ : x greatest_R β .≡. x max_R β.
Proof of Part I. Let R ∈ Antisym ∩ Connex. Then by Thm.X.6.6,
Part I,

(1) x least_R β .⊃. x min_R β.

Now let x min_R β. Then x ∈ β ∩ AV(R). So xRx by Thm.X.6.9, Part V,
and Thm.X.6.1, Part I. Now assume y ∈ β ∩ AV(R). Then xRy ∨ yRx by
Thm.X.6.9, Part I. That is

(2) ∼(yRx) ⊃ xRy.

Case 1. y = x. Then xRy follows from xRx.

Case 2. y ≠ x. Then ∼(yRx) follows from the definition of x min_R β.
Then xRy follows from (2).
Proof  of  Part  II.     Similar. 

An  ordered  set  is  said  to  be  simply  ordered  if  its  ordering  relation  is 
reflexive,  transitive,  antisymmetric,  and  connected.  A  simply  ordered  set 
is  sometimes  called  a  chain.  Accordingly,  we  shall  say  that  a  relation  is  a 
simple  ordering  relation  or  a  chain  relation  if  it  is  reflexive,  transitive,  anti- 
symmetric, and  connected.    So  we  define 

Sord     for     Ref ∩ Trans ∩ Antisym ∩ Connex.

Theorem X.6.11.

I. ⊢ (R): R ∈ Sord: ≡ : R ∈ Ref. R ∈ Trans. R ∈ Antisym. R ∈ Connex.

II. ⊢ Sord = Pord ∩ Connex.

III. ⊢ Sord ⊆ Rel.

IV. ⊢ (β,R): R ∈ Sord .⊃. β↿R↾β ∈ Sord.
V. ⊢ (R): R ∈ Sord .⊃. R̆ ∈ Sord.

Thus any subset of a simply ordered set is itself a simply ordered set
relative to the same ordering relation. Likewise, reversing the order leaves
the set simply ordered. Moreover, by Thm.X.6.10, in a simply ordered
set, the notions of least and minimal coincide, and the notions of greatest
and maximal coincide.

The notions of ≤ between real numbers, or between rational numbers,
or  between  positive  integers,  etc.,  are  all  especially  typical  examples  of 
simple  ordering  relations.  A  less  familiar  example  would  be  the  relation 
R  which  holds  between  two  complex  numbers  z  and  w  when  either  the  imagi- 
nary part  of  z  is  less  than  the  imaginary  part  of  w  or  else  the  imaginary 
parts  of  z  and  w  are  equal  and  the  real  part  of  z  is  less  than  or  equal  to  the 
real  part  of  w. 
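
The less familiar example can be examined on a small grid of complex numbers. The Python sketch below is an illustration under the assumption that relations are finite sets of pairs; lex_le and the predicate names are hypothetical. It checks the four defining properties of a simple ordering and, for one sample β, that the least and minimal members coincide as Thm.X.6.10 asserts.

```python
def av(r):
    """AV(R): everything occurring as an argument or a value of R."""
    return {x for x, _ in r} | {y for _, y in r}


def is_ref(r):
    return all((x, x) in r for x in av(r))


def is_trans(r):
    return all((x, z) in r for x, y in r for y2, z in r if y == y2)


def is_antisym(r):
    return all(x == y for x, y in r if (y, x) in r)


def is_connex(r):
    dom = av(r)
    return all((x, y) in r or (y, x) in r for x in dom for y in dom)


def lex_le(z, w):
    """Compare imaginary parts first; on a tie, compare real parts with <=."""
    return z.imag < w.imag or (z.imag == w.imag and z.real <= w.real)


points = [complex(a, b) for a in range(3) for b in range(3)]
rel = {(z, w) for z in points for w in points if lex_le(z, w)}

beta = {complex(2, 0), complex(0, 1), complex(1, 1)}
dom = beta & av(rel)
least = {x for x in dom if all((x, y) in rel for y in dom)}
minimal = {x for x in dom if not any((y, x) in rel and y != x for y in dom)}
```

On this grid the relation passes all four tests, and for the sample β the single least member, 2 + 0i, is also the single minimal member.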

Given any simply ordered set α and any element x of α, we define the
segment of α determined by x to be the set of all elements which "precede"
x, that is, the set of all elements y such that y(R − I)x. Since this segment
is a subset of α, it is also a simply ordered set. If R is the ordering relation
of α, we denote by seg_x R the ordering relation of the segment of α
determined by x. So we define

seg_x R     for     (ŷ(y(R − I)x))↿R↾(ŷ(y(R − I)x)),

where y is a variable not occurring in R or x.

For stratification of seg_x R, R and seg_x R have to be one type higher than x.
Any free occurrences of variables in seg_x R will be those in x or R.

By Thm.X.6.1, Part V, Thm.X.6.2, Part IV, Thm.X.6.4, Part III, and
Thm.X.6.9, Part III, we see that seg_x R will have any of the properties Ref,
Trans, Antisym, and Connex that R has.

Theorem X.6.12.
I. ⊢ (R,x,y,z): y(seg_x R)z .≡. y(R − I)x. yRz. z(R − I)x.

II. ⊢ (R,x,y,z):. R ∈ Trans. R ∈ Antisym: ⊃ : y(seg_x R)z .≡. yRz. z(R − I)x.
*III. ⊢ (R,x): R ∈ Ref .⊃. AV(seg_x R) = ŷ(y(R − I)x).

Proof of Part II. Use Thm.X.6.2, Part I, and Thm.X.6.4, Part VI.

Proof of Part III. One easily gets ⊢ AV(seg_x R) ⊆ ŷ(y(R − I)x). Now
let R ∈ Ref and y(R − I)x. Then y(R − I)x. yRy. y(R − I)x. So by
Part I, y(seg_x R)y. So y ∈ AV(seg_x R).

Theorem X.6.13. ⊢ (R,x,y): R ∈ Trans. R ∈ Antisym. y(R − I)x .⊃.
seg_y(seg_x R) = seg_y R.

Proof. Assume the hypothesis. Temporarily put

A = ŷ(y(R − I)x),

S = seg_x R = A↿R↾A,

B = ẑ(z(S − I)y),

C = ẑ(z(R − I)y).
Then

seg_y(seg_x R) = seg_y S = B↿S↾B
= B↿(A↿R↾A)↾B
= (A ∩ B)↿R↾(A ∩ B),

and

seg_y R = C↿R↾C.

Thus it suffices to prove A ∩ B = C.

First let z ∈ A ∩ B. So z ∈ B. So zSy. z ≠ y. So zRy. z ≠ y. So z ∈ C.

Conversely, let z ∈ C. So zRy. z ≠ y. However, by hypothesis, yRx.
y ≠ x. So since R ∈ Trans, zRx. Also since R ∈ Antisym, z ≠ x. So
z ∈ A. Also y ∈ A. Thus z ∈ A. zRy. y ∈ A. That is, z(A↿R↾A)y. That is,
zSy. So z ∈ B.

*Corollary. ⊢ (R,S,x,y): R ∈ Pord. S = seg_x R. y ∈ Arg(S) .⊃. seg_y S =
seg_y R.

This theorem says in effect that, if y is in the segment of α determined by
x, then the segment of that segment determined by y is the same as the
segment of α determined by y.
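
Thm.X.6.13 and Part III of Thm.X.6.12 can be spot-checked on a finite transitive, antisymmetric relation. The Python sketch below assumes relations are finite sets of pairs and again uses "being a factor of" up to 12; seg and av are hypothetical helper names following the definition of seg_x R.

```python
def av(r):
    """AV(R): everything occurring as an argument or a value of R."""
    return {x for x, _ in r} | {y for _, y in r}


def seg(x, r):
    """seg_x R: R restricted on both sides to the y with y(R − I)x."""
    before = {y for y, z in r if z == x and y != x}
    return {(y, z) for y, z in r if y in before and z in before}


divides = {(m, n) for m in range(1, 13) for n in range(1, 13) if n % m == 0}

# Every pair (y, x) with y(R − I)x, that is, y a proper factor of x.
proper = [(y, x) for y, x in divides if y != x]
agree = all(seg(y, seg(x, divides)) == seg(y, divides) for y, x in proper)
```

Here agree records that seg_y(seg_x R) and seg_y R coincide for every proper factor y of x in the range considered; for instance the segment determined by 12 has field {1, 2, 3, 4, 6}.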

We say that an ordered set α satisfies the "ascending chain condition"
(descending chain condition) if and only if each nonempty subset of α has a
maximal (minimal) element. We shall be interested only in the special case
where we have a simply ordered set satisfying a descending chain condition.
Such a set is called "well ordered." As always, we shall deal only with the
ordering relation of the set, which shall be called a well-ordering relation.
So we define

Word     for     R̂(R ∈ Sord:(β): β ∩ AV(R) ≠ Λ .⊃. (Ey). y min_R β).

Word is stratified and has no free occurrences of a variable and so may
be assigned any type.
Theorem X.6.14.

I. ⊢ (R):. R ∈ Word: ≡ : R ∈ Sord:(β): β ∩ AV(R) ≠ Λ .⊃. (Ey). y min_R β.
*II. ⊢ (R):. R ∈ Word: ≡ : R ∈ Sord:(β): β ∩ AV(R) ≠ Λ .⊃. (Ey). y least_R β.

III. ⊢ (R):. R ∈ Word: ≡ : R ∈ Sord:(β): β ∩ AV(R) ≠ Λ .⊃. (E₁y). y least_R β.

IV. ⊢ (γ,R): R ∈ Word .⊃. γ↿R↾γ ∈ Word.
V. ⊢ (R,x): R ∈ Word .⊃. seg_x R ∈ Word.
Proof of Part II. Use Thm.X.6.10.
Proof of Part III. Use Thm.X.6.7.

Proof of Part IV. Assume R ∈ Word and put temporarily S = γ↿R↾γ.
Then S ∈ Sord. Also by Thm.X.6.1, Part VII, AV(S) = γ ∩ AV(R). So

(1) β ∩ AV(S) = (β ∩ γ) ∩ AV(R).
Also by Thm.X.6.8,
(2) y least_S β .≡. y least_R (β ∩ γ).

Now by Part II,

(β ∩ γ) ∩ AV(R) ≠ Λ .⊃. (Ey). y least_R (β ∩ γ).

So by (1) and (2),

β ∩ AV(S) ≠ Λ .⊃. (Ey). y least_S β.

We say that x is an upper bound of y and z with respect to R if yRx. zRx.
We say that x is a least upper bound of y and z if x is an upper bound and if
xRw for every upper bound w. In symbols

x lub_R y,z     for     yRx. zRx:(w): yRw. zRw .⊃. xRw.

We say that x is a greatest lower bound (glb) of y and z with respect to R
if it is a lub with respect to R̆.

If R is the relation of "being a factor of" among positive integers, then
the lub of y and z with respect to R is their least common multiple, and the
glb of y and z with respect to R is their greatest common factor.

If R is the relation ⊆ among subsets of a given set Σ, then by Thm.
IX.4.18, α ∪ β is the lub of α and β and α ∩ β is the glb of α and β.

A lattice is a partially ordered set such that each pair of elements y and z
of the set has a lub and a glb.

In symbols we write

Lattice
for

R̂(R ∈ Pord::(y,z):: y,z ∈ AV(R).: ⊃ :.(Ex): yRx. zRx:(w): yRw. zRw .⊃. xRw:.
(Ex): xRy. xRz:(w): wRy. wRz .⊃. wRx).

As elsewhere in this section, we are dealing with the ordering relation
instead of with the class which it ordered. So our definition of "Lattice"
makes it contain all ordering relations of lattices. As examples of such
relations, we cite the relation of "being a factor of" among positive integers
and the relation of ⊆ among subsets of a given set Σ.
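
For the first of these examples the lub and glb can be computed directly from the definitions and compared with the least common multiple and greatest common factor. The Python sketch below assumes the finite set of divisors of 60 in place of all positive integers, and lubs, glbs, and av are hypothetical helper names.

```python
from math import gcd


def av(r):
    """AV(R): everything occurring as an argument or a value of R."""
    return {x for x, _ in r} | {y for _, y in r}


def lubs(r, y, z):
    """All x with yRx, zRx, and xRw for every w such that yRw and zRw."""
    uppers = {x for x in av(r) if (y, x) in r and (z, x) in r}
    return {x for x in uppers if all((x, w) in r for w in uppers)}


def glbs(r, y, z):
    """A glb with respect to R is a lub with respect to the converse of R."""
    converse = {(b, a) for a, b in r}
    return lubs(converse, y, z)


divisors = [d for d in range(1, 61) if 60 % d == 0]
divides = {(m, n) for m in divisors for n in divisors if n % m == 0}

lattice_ok = all(
    lubs(divides, y, z) == {y * z // gcd(y, z)}
    and glbs(divides, y, z) == {gcd(y, z)}
    for y in divisors for z in divisors
)
```

On the divisors of 60 every pair has exactly one lub, equal to its least common multiple, and exactly one glb, equal to its greatest common factor, so this finite relation meets the defining condition of Lattice.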

One  may  find  an  extensive  treatment  of  lattices  in  Birkhoff,  1948,  from 
which  most  of  the  material  of  this  section  was  taken. 

Theorem X.6.15.

I. ⊢ (R): R ∈ Ref .≡. RUSC(R) ∈ Ref.

II. ⊢ (R): R ∈ Trans .≡. RUSC(R) ∈ Trans.

III. ⊢ (R): R ∈ Antisym .≡. RUSC(R) ∈ Antisym.

IV. ⊢ (R): R ∈ Connex .≡. RUSC(R) ∈ Connex.
V. ⊢ (R): R ∈ Word .≡. RUSC(R) ∈ Word.

VI. ⊢ (β,R,x): x least_R β .≡. {x} least_RUSC(R) USC(β).
VII. ⊢ (R,x). RUSC(seg_x R) = seg_{x} RUSC(R).
EXERCISES
X.6.1. Prove:

(a) ⊢ (R): R ∈ Antisym .≡. R ∩ R̆ ⊆ I.

(b) ⊢ (α,β,R). α↿(β↿R↾β)↾α = (α ∩ β)↿R↾(α ∩ β).

X.6.2. Let A be a fixed point and let R be the relation which holds
between  two  points  x  and  y  of  three-dimensional  space  when  the  distance 
from  X  to  A  is  less  than  or  equal  to  the  distance  from  y  to  A.  Which  of  the 
properties  Ref,  Trans,  Antisym,  and  Connex  does  R  have? 

X.6.3. Prove:

⊢ (R,S):. S ∈ Qord. R = x̂ŷ(xSy. ySx): ⊃ : R ∈ Qord:(x,y): xRy .⊃. yRx.

X.6.4. Prove ⊢ Λ ∈ Word.
X.6.5. If we define

⊆     for     α̂β̂(α ⊆ β),

then prove:

⊢ (Σ). SC(Σ)↿⊆↾SC(Σ) ∈ Lattice.

X.6.6. Prove:

(a) ⊢ (R,x,y):. R ∈ Ref. x,y ∈ AV(R): ⊃ : xRy .≡. x(R − I)y .∨. x = y.

(b) ⊢ (R,x,y): R ∈ Ref. R ∈ Connex. x,y ∈ AV(R) .⊃. x(R − I)y .∨. x = y .∨.
y(R − I)x.

(c) ⊢ (R,x,y): R ∈ Ref. R ∈ Antisym. R ∈ Connex. x,y ∈ AV(R) .⊃.
x(R − I)y ≡ ∼(yRx).

(d) ⊢ (R,x,y): R ∈ Ref. R ∈ Antisym. R ∈ Connex. x,y ∈ AV(R) .⊃.
xRy ≡ ∼(y(R − I)x).

X.6.7. Prove ⊢ (x): {⟨x,x⟩} ∈ Word.

7. Equivalence Relations. We have defined reflexive and transitive
relations. We say that a relation R is symmetric if xRy ⊃ yRx. So we
define

Sym     for     R̂(x,y): xRy .⊃. yRx.

Sym is stratified and has no free occurrences of a variable and so may be
assigned any type.
Theorem X.7.1.

I. ⊢ (R):. R ∈ Sym: ≡ : R ∈ Rel:(x,y): xRy .⊃. yRx.
II. ⊢ Sym ⊆ Rel.

III. ⊢ (β,R): R ∈ Sym .⊃. β↿R↾β ∈ Sym.

IV. ⊢ (R): R ∈ Sym .⊃. R̆ ∈ Sym.

Notice that the identity relation I is reflexive, symmetric, and transitive.
In general, any kind of relation of equivalence or congruence will be
reflexive, symmetric, and transitive. Hence we refer to any relation which is
reflexive, symmetric, and transitive as an equivalence relation. We define

Equiv     for     Ref ∩ Sym ∩ Trans.

Theorem X.7.2.

I. ⊢ (R): R ∈ Equiv .≡. R ∈ Ref. R ∈ Sym. R ∈ Trans.
II. ⊢ (β,R): R ∈ Equiv .⊃. β↿R↾β ∈ Equiv.

III. ⊢ (R): R ∈ Equiv .⊃. R̆ ∈ Equiv.

IV. ⊢ (R,x,y):. R ∈ Equiv. xRy: ⊃ :(z). xRz ≡ yRz.

Proof of IV. From xRy and xRz, we get yRx by R ∈ Sym and then yRz
by R ∈ Trans. Similarly, from xRy and yRz, we get xRz.

If R is an equivalence relation, and we have xRy, we say that x and y are
equivalent with respect to R. Because R is transitive, things equivalent
to the same thing will be equivalent to each other.

The  examples  of  equivalence  relations  are  many  and  important,  and  we 
cite  a  few. 

1. The relation of congruence of integers modulo an integer k, for a fixed
k:

R = x̂ŷ(En). x = y + kn.

2. The relation, between two elements x and y of a group G, which holds
when there is an element h from a fixed subgroup H such that y = xh:

R = x̂ŷ(Eh). h ∈ H. y = xh.

3. When S ∈ Qord, the relation:

R = x̂ŷ(xSy. ySx).

4. The relation of similarity between geometrical figures.

5. The relation that holds between two ordered pairs of positive integers
⟨x,y⟩ and ⟨u,v⟩ when xv = yu:

R = α̂β̂(Ex,y,u,v). α = ⟨x,y⟩. β = ⟨u,v⟩. xv = yu.

6. The relation, between two sequences λn(a_n) and λn(b_n) of rational
numbers, which holds when (ε):. ε > 0: ⊃ :(EM,N)(m,n): m > M. n > N.
⊃ . |a_m − b_n| < ε.

7. The relation, between classes, of having an equal number of members.

Given an element x in AV(R), we can form the class of all elements
equivalent to x, namely, R"{x}. If we form such classes for every x in
AV(R), the resulting set of classes is called the set of equivalence classes of
R. Two elements are equivalent if and only if they belong to the same
equivalence class. Two equivalence classes either have no member in
common, or else they are identical. Thus we define
EqC(R)        for        α̂(Ex).x ∈ AV(R).α = R"{x}

where α and x are variables not occurring in R.
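For a finite relation the definition of EqC(R) can be carried out mechanically. The following is a minimal sketch, assuming that a relation is represented as a Python set of ordered pairs; the helper names `field`, `image_of_singleton`, `eqc`, and `is_equiv` are illustrative and are not part of the text. It uses congruence modulo 3 on a small window of integers.

```python
from itertools import product

def field(R):
    """AV(R): everything occurring as an argument or a value of R."""
    return {u for (u, _) in R} | {v for (_, v) in R}

def image_of_singleton(R, x):
    """R"{x}: the class of all y such that x R y."""
    return frozenset(y for (u, y) in R if u == x)

def eqc(R):
    """EqC(R): the classes R"{x} for x ranging over AV(R)."""
    return {image_of_singleton(R, x) for x in field(R)}

def is_equiv(R):
    """Check that R is reflexive on AV(R), symmetric, and transitive."""
    reflexive = all((x, x) in R for x in field(R))
    symmetric = all((y, x) in R for (x, y) in R)
    transitive = all((x, w) in R
                     for (x, y) in R for (z, w) in R if y == z)
    return reflexive and symmetric and transitive

# Congruence modulo k, restricted to a finite window of integers
# (the window is an assumption made so that the relation is finite).
k = 3
universe = range(-6, 7)
R = {(x, y) for x, y in product(universe, repeat=2) if (x - y) % k == 0}
classes = eqc(R)
```

Here `classes` contains three classes, which are pairwise disjoint and together exhaust the window.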

EqC(R) is stratified if and only if R is stratified, and if stratified is of
type one higher than R. All free occurrences of variables in EqC(R) are
those in R.

Theorem X.7.3. ⊢ (α,R):α ∈ EqC(R). ≡ .(Ex).x ∈ AV(R).α = R"{x}.

Theorem X.7.4. ⊢ (R):.R ∈ Equiv: ⊃ :(x,y):xRy. ≡ .x,y ∈ AV(R).
R"{x} = R"{y}.

Proof. Let R ∈ Equiv. If xRy, then x,y ∈ AV(R). Also by Thm.X.7.2,
Part IV, (z).xRz ≡ yRz. Then by Thm.X.4.22, corollary, (z):z ∈ R"{x}.
≡ .z ∈ R"{y}. That is, R"{x} = R"{y}.

Conversely, let x,y ∈ AV(R) and R"{x} = R"{y}. Since R ∈ Ref, xRx.
So by Thm.X.4.22, corollary, x ∈ R"{x}. So x ∈ R"{y}. So yRx. So xRy,
since R ∈ Sym.

Theorem X.7.5. ⊢ (α,R):.R ∈ Equiv.α ∈ EqC(R): ⊃ :(x):x ∈ α. ≡ .
α = R"{x}.

Proof. Assume R ∈ Equiv and α ∈ EqC(R). Then by rule C

(1) y ∈ AV(R).α = R"{y}.

Now let x ∈ α. Then x ∈ R"{y}. So yRx. So by Thm.X.7.4, R"{y} =
R"{x}. So α = R"{x}.

Conversely, let α = R"{x}. Then by (1),

(2) R"{x} = R"{y}.

Since y ∈ AV(R) and R ∈ Ref, yRy. So by Thm.X.4.22, corollary,

(3) y ∈ R"{y}.

Hence by (2), y ∈ R"{x}. Hence by Thm.X.4.22, corollary, xRy. Since
R ∈ Sym, yRx. Hence by Thm.X.4.22, corollary, x ∈ R"{y}. So by (1),
x ∈ α.

Theorem X.7.6. ⊢ (α,β,R):R ∈ Equiv.α,β ∈ EqC(R).α ∩ β ≠ Λ. ⊃ .α = β.

Proof. Assume the hypothesis. From α ∩ β ≠ Λ by rule C, we get
x ∈ α ∩ β. So by Thm.X.7.5, α = R"{x} and β = R"{x}. So α = β.

Theorem X.7.7. ⊢ (R,x):R ∈ Equiv.x ∈ AV(R). ⊃ .(E₁α).x ∈ α.α ∈ EqC(R).

Proof. Assume R ∈ Equiv and x ∈ AV(R). Then by Thm.X.7.3,
R"{x} ∈ EqC(R). Also xRx, since R ∈ Ref, and so x ∈ R"{x} by Thm.
X.4.22, corollary. So

(1) (Eα).x ∈ α.α ∈ EqC(R).

Now assume x ∈ α.α ∈ EqC(R).x ∈ β.β ∈ EqC(R). Then by Thm.X.7.6,
α = β. So by (1) and Thm.VII.2.1, we get (E₁α).x ∈ α.α ∈ EqC(R).

Theorem X.7.8. ⊢ (α,R):R ∈ Equiv.α ∈ EqC(R). ⊃ .α ≠ Λ.

Proof. Assume R ∈ Equiv and α ∈ EqC(R). Then by rule C, x ∈ AV(R).
α = R"{x}. Then xRx, since R ∈ Ref. So x ∈ R"{x} by Thm.X.4.22,
corollary. So x ∈ α, and α ≠ Λ.

If we take R to be the first of our seven listed instances of equivalence
relations, namely, congruence modulo k, then EqC(R) is just the set of
residue classes modulo k, and R"{x} is the residue class of x modulo k.

If we take the second R listed, then EqC(R) is the set of left cosets of H
in G (see Birkhoff and MacLane, page 146) and R"{x} is the left coset
corresponding to a given x.

By dealing with the equivalence classes of R rather than with the
elements of AV(R), we have the effect of identifying any two equivalent
elements. For this reason, various structural features of AV(R) will
appear in EqC(R) also, but simplified by the elimination of any distinction
between equivalent elements.

Thus with our first listed R, EqC(R) is a ring, and if k is prime it is even
a field. With our second listed R, EqC(R) is a group if H is a normal
subgroup (Birkhoff and MacLane, page 158). With our third listed R,
EqC(R) is partially ordered. We find this idea of using equivalence classes
to produce the effect of identification of elements mentioned on page 3 of
Lefschetz, 1942. It is a standard device in many parts of mathematics,
particularly higher algebra.
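The claim that the residue classes modulo k form a ring, and a field when k is prime, rests on the fact that sums and products do not depend on which representatives of the classes are chosen. The following is a rough spot check of this on a finite window, not a proof; the function names and the window are our own assumptions.

```python
def equivalent(x, y, k):
    """x and y are congruent modulo k."""
    return (x - y) % k == 0

def operations_well_defined(k, window=range(-12, 13), shifts=range(-2, 3)):
    """Replace a and b by equivalent representatives a2 and b2 and check
    that a + b and a * b stay within the same residue class."""
    for a in window:
        for b in window:
            for s in shifts:
                for t in shifts:
                    a2, b2 = a + k * s, b + k * t
                    if not equivalent(a + b, a2 + b2, k):
                        return False
                    if not equivalent(a * b, a2 * b2, k):
                        return False
    return True

def nonzero_classes_invertible(k):
    """Every nonzero residue class has a multiplicative inverse."""
    return all(any((a * b) % k == 1 for b in range(1, k))
               for a in range(1, k))
```

For k = 5 every nonzero class is invertible, whereas for k = 6 the class of 2 has no inverse, in agreement with the remark that EqC(R) is a field when k is prime.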

Besides  their  many  important  uses  in  mathematics,  equivalence  classes 
are  useful  in  formalizing  mathematical  notions.  For  instance,  suppose  we 
wish  a  term  denoting  the  shape  of  a  geometrical  figure.  Similarity  of 
geometrical  figures  is  defined  without  reference  to  shape,  but  nevertheless 
two  figures  have  the  same  shape  if  and  only  if  they  are  similar.  Hence  the 
equivalence  class  of  a  given  geometrical  figure  is  the  class  of  all  figures  with 
the  same  shape,  wherever  situated.  One  can  then  define  this  equivalence 
class  to  be  the  shape  of  the  figure.  Intuitively,  this  might  not  be  considered 
a  good  definition,  but  as  a  formal  definition  of  shape,  it  serves  very  nicely. 
In  a  certain  sense,  one  is  thus  thinking  of  a  given  equivalence  class  as  being 
the  abstraction  of  the  common  property  of  all  its  members. 

To  see  another  application  of  this  idea,  consider  our  fifth  listed  equiva- 
lence relation.  We  have  xv  =  yu  if  and  only  if  the  ratios  x/y  and  u/v  are 
equal.  Thus  our  relation  defines  equality  of  two  ratios  without  making  use 
of  the  notion  of  a  ratio.  Thus  we  can  use  the  corresponding  equivalence 
classes  to  define  the  notion  of  ratio. 

As  this  is  a  very  important  point,  let  us  be  more  explicit.  Suppose  we 
have  constructed  the  positive  integers  and  wish  to  construct  the  rational 
numbers.  We  can  use  the  ratios  between  integers  to  serve  as  rational 
numbers  in  case  we  can  define  the  ratios  between  integers.  Now  with  our 
fifth listed R, we have ⟨x,y⟩R⟨u,v⟩ if and only if the ratio x/y equals the
ratio u/v. So the equivalence class R"{⟨x,y⟩} of ⟨x,y⟩ will be the class of all
ordered pairs ⟨u,v⟩ with the ratio u/v equal to the ratio x/y. Hence we can
take this equivalence class to be the definition of the ratio x/y. That is,
we define

x/y = R"{⟨x,y⟩}.

We shall have

R"{⟨3,4⟩} = R"{⟨9,12⟩}

(see Thm.X.7.4). So we have

3/4 = 9/12.


In  similar  fashion,  we  derive  other  familiar  properties  of  ratios. 
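The definition of a ratio as an equivalence class can be tried out on a finite stock of pairs. The sketch below assumes pairs of positive integers bounded by a parameter `bound` (an assumption made only to keep the classes finite); the names `same_ratio`, `ratio_class`, and `lowest_terms` are illustrative.

```python
from math import gcd

def same_ratio(p, q):
    """The fifth listed relation: <x,y> R <u,v> when xv = yu."""
    (x, y), (u, v) = p, q
    return x * v == y * u

def ratio_class(p, bound):
    """The members of R"{p} among pairs of positive integers <= bound."""
    return frozenset((u, v)
                     for u in range(1, bound + 1)
                     for v in range(1, bound + 1)
                     if same_ratio(p, (u, v)))

def lowest_terms(p):
    """One convenient name for the class of p: its member in lowest terms."""
    x, y = p
    g = gcd(x, y)
    return (x // g, y // g)
```

With `bound = 12`, the pairs ⟨3,4⟩ and ⟨9,12⟩ yield the same class, so that 3/4 and 9/12 are two names of one ratio.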

Incidentally,  the  foregoing  discussion  illuminates  the  distinction  between 
a certain ratio, R"{⟨3,4⟩}, and various names of it, such as 3/4, 9/12, etc.

Thus,  by  defining  the  notion  of  having  equal  ratios  without  referring  to 
ratios,  and  then  abstracting  from  this  notion  by  means  of  equivalence 
classes,  we  are  able  to  define  ratios,  and  thus  introduce  rational  numbers. 

How  can  we  proceed  from  rational  numbers  to  reals?  The  reals  can  be 
thought  of  as  limits  of  sequences  of  rationals.  If  we  can  define  the  notion 
of  two  sequences  having  the  same  limit  without  referring  to  limits,  then  we 
can  abstract  from  this  notion  by  means  of  equivalence  classes,  and  thus 
define  the  common  limit  of  the  sequences.  Our  sixth  listed  relation  is  just 
the notion of two sequences having the same limit. Thus R"{λn(a_n)} is the
class of all sequences λn(b_n) with the same limit as λn(a_n). So we can take
the equivalence class R"{λn(a_n)} of a sequence λn(a_n) as constituting the
real  number  which  is  the  limit  of  the  sequence.  The  totality  of  such  limits 
will  comprise  the  set  of  real  numbers. 
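The sixth listed relation refers to all terms beyond some point, so no finite computation can verify it. Still, a spot check on initial segments may make the idea concrete. The sketch below assumes two sequences of rationals produced by Newton's iteration for the square root of 2 from different starting values; this choice of sequences, and the names used, are ours and not the text's.

```python
from fractions import Fraction

def newton_sqrt2(start, terms):
    """The first `terms` rationals a_0, a_1, ... with a_0 = start and
    a_(n+1) = (a_n + 2/a_n)/2."""
    a = Fraction(start)
    out = []
    for _ in range(terms):
        out.append(a)
        a = (a + 2 / a) / 2
    return out

def close_beyond(a, b, eps, M, N):
    """Check |a_m - b_n| < eps for all available m > M and n > N.
    This inspects finitely many terms only."""
    return all(abs(a[m] - b[n]) < eps
               for m in range(M + 1, len(a))
               for n in range(N + 1, len(b)))

a = newton_sqrt2(1, 8)
b = newton_sqrt2(3, 8)
c = [Fraction(3, 2)] * 8          # a sequence with a different limit
```

On these segments `a` and `b` agree to within one millionth beyond the fourth term, while `a` and `c` do not; the two Newton sequences would thus be assigned to the same real number.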

If  we  can  define  our  seventh  listed  relation  without  reference  to  number, 
we  can  abstract  from  it  by  means  of  equivalence  classes  and  get  a  definition 
of  number.    This  will  constitute  the  developments  of  the  next  chapter. 

EXERCISES 

X.7.1. Prove:

(a) ⊢ (R):R ∈ Funct. ⊃ .R|R̆ ∈ Equiv.

(b) ⊢ (R):R ∈ Equiv. ≡ .RUSC(R) ∈ Equiv.

(c) ⊢ (R,S):S ∈ Equiv.R = {⟨{x},α⟩ | x ∈ AV(S).α ∈ EqC(S).x ∈ α}. ⊃ .
R ∈ Funct.RUSC(S) = R|R̆.

(d) ⊢ (S):S ∈ Equiv. ≡ .(ER).R ∈ Funct.RUSC(S) = R|R̆.

X.7.2. Prove ⊢ Sym ∩ Trans ⊆ Ref.

X.7.3. What conditions should one impose on λ so that R = x̂ŷ(Eα).
α ∈ λ.x,y ∈ α should be an equivalence relation? If R (as defined) is an
equivalence relation, what are AV(R) and EqC(R)?

X.7.4. Prove ⊢ (R,S):S ∈ Qord.R = x̂ŷ(xSy.ySx). ⊃ .α̂β̂(α,β ∈ EqC(R):
(Ex,y).xSy.x ∈ α.y ∈ β) ∈ Pord.

X.7.5.  Prove  that  the  relation  of  being  equidistant  from  a  fixed  point  of 
three-dimensional  space  is  an  equivalence  relation.  What  are  the  corre- 
sponding equivalence  classes? 

X.7.6.  Prove  that  congruence  (modulo  1)  as  defined  in  Sec.  7  of  Chapter 
IX  is  an  equivalence  relation. 

X.7.7. Prove:

(a) ⊢ Λ ∈ Equiv.

(b) ⊢ I ∈ Equiv.

X.7.8. What are EqC(Λ) and EqC(I)?

X.7.9. Prove ⊢ (R):R ∈ Funct. ⊃ .x̂ŷ(x,y ∈ Arg(R).R(x) = R(y)) ∈ Equiv.

8.  Applications.  We  have  now  got  sufficiently  close  to  mathematics  that 
the  applications  begin  to  be  quite  direct,  and  we  have  noted  these  applica- 
tions as  we  progressed.  Thus,  in  Sec.  4,  we  noted  many  applications  of 
R\S,  R"a,  etc.,  as  we  proved  the  theorems  involved.  The  material  on 
functions,  in  Sec.  5,  is  of  constant  application.  Several  of  the  exercises  at 
the  end  of  Sec.  5  are  considered  to  be  purely  mathematical  theorems.  The 
material  of  Sec.  6  is  taken  almost  entirely  from  Birkhoff,  1948,  and  is  of 
constant use in the theory of ordered sets. The theory of equivalence
classes presented in Sec. 7 is of frequent use as a means of "identifying"
sets of objects in some structure to get a new structure with special
properties. Also, many useful abstract ideas can be defined as equivalence
classes of properly chosen equivalence relations. We mentioned the definition
of rationals as equivalence classes of ordered pairs of integers, and the
definition of reals as equivalence classes of sequences of rationals. In later
chapters we shall define cardinal and ordinal numbers as equivalence classes
with respect to properly chosen equivalence relations.


CHAPTER  XI 
CARDINAL  NUMBERS 

1. Cardinal Similarity. As noted at the end of the preceding chapter,
if  we  can  define  the  notion  of  two  classes  having  the  same  number  of 
members,  without  referring  to  number,  then  we  can  define  number  by 
abstraction  from  this  notion.  Clearly  two  classes  have  the  same  number 
of  members  if  and  only  if  we  can  pair  each  member  of  the  first  class  with 
exactly  one  member  of  the  second  class  in  such  a  way  that  each  member  of 
the  second  class  is  paired  with  a  member  of  the  first  class.  The  collection 
of  pairs  which  comprise  such  a  pairing  would  constitute  a  1-1  relation  having 
the  members  of  the  first  class  for  its  arguments  and  the  members  of  the 
second  class  for  its  values.  Conversely,  given  such  a  1-1  relation,  the 
ordered  pairs  constituting  the  relation  would  comprise  a  pairing  of  all 
members  of  the  first  class  with  all  members  of  the  second  class. 

So  two  classes  have  the  same  number  of  members  if  and  only  if  there 
exists  a  1-1  relation  which  has  the  first  class  for  its  arguments  and  the 
second  class  for  its  values. 
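For finite classes this criterion can be tested directly. The following sketch assumes that a relation is given as a Python set of ordered pairs; the names `arg`, `val`, `is_one_one`, and `similar_wrt` are illustrative and merely echo the notions Arg, Val, and 1-1 of the text.

```python
def arg(R):
    """Arg(R): the class of arguments of R."""
    return {x for (x, _) in R}

def val(R):
    """Val(R): the class of values of R."""
    return {y for (_, y) in R}

def is_one_one(R):
    """R is 1-1 when no argument has two values and no value has two
    arguments, that is, when there are as many distinct arguments and
    as many distinct values as there are pairs."""
    return len(arg(R)) == len(R) == len(val(R))

def similar_wrt(alpha, beta, R):
    """alpha and beta are paired off exactly by the 1-1 relation R."""
    return is_one_one(R) and arg(R) == set(alpha) and val(R) == set(beta)

R = {(1, "a"), (2, "b"), (3, "c")}
```

Here `similar_wrt({1, 2, 3}, {"a", "b", "c"}, R)` holds, whereas adding the pair `(1, "b")` to `R` destroys the 1-1 property.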

We say that α and β are similar (or equinumerous) with respect to R if
R ∈ 1-1 and α = Arg(R) and β = Val(R). That is, we define

α sm_R β         for         R ∈ 1-1.α = Arg(R).β = Val(R).

α and β are said to be similar (or equinumerous) if there is a relation R
with respect to which they are similar. So we define

sm        for        α̂β̂(ER).α sm_R β.

Thus  sm  is  the  relation  of  having  the  same  number  of  members.  In  the 
next  section  we  shall  define  the  notion  of  cardinal  number  by  abstraction 
from  sm.  However,  first  we  prove  various  properties  of  sm,  beginning  with 
a  proof  that  it  is  an  equivalence  relation. 

If α, β, and R are variables, then the indicated occurrences of α, β, and R
are the only free occurrences of variables in α sm_R β, and α sm_R β is stratified
if and only if α, β, and R all have the same type. The term sm is stratified
and has no free variables and may be assigned any type.

Theorem  XI.1.1. 
I.  I  3(sm). 

II. ⊢ sm ∈ Rel.

III. ⊢ (α,β):α sm β. ≡ .(ER).α sm_R β.
*IV. ⊢ (α,β):.α sm β: ≡ :(ER):R ∈ 1-1.α = Arg(R).β = Val(R).

Proof. Use Thm.X.3.5, Thm.X.3.6, corollary, and Thm.X.3.7.

Theorem XI.1.2. ⊢ (α,β):α sm β. ≡ .(ER).R ∈ 1-1.α ⊆ Arg(R).β = R"α.

Proof. Assume α sm β. Then by rule C and the definition of α sm_R β, we
get R ∈ 1-1.α = Arg(R).β = Val(R). Then by Thm.X.4.26, we get R ∈ 1-1.
α ⊆ Arg(R).β = R"α. Conversely, assume (ER).R ∈ 1-1.α ⊆ Arg(R).
β = R"α. Use rule C and let S denote α↿R. Then by Thm.X.5.19, Cor. 5,
S ∈ 1-1, and by Thm.X.4.9, Part I, α = Arg(S). Also, by the definition
of R"α, we get β = Val(S). So α sm_S β. So α sm β.

*Theorem XI.1.3. ⊢ (α).α sm α.

Proof. We have ⊢ I ∈ 1-1 and ⊢ α ⊆ Arg(I). Also by Ex.X.5.15,
⊢ α = I"α. Accordingly, by Thm.XI.1.2, ⊢ α sm α.

Corollary 1. ⊢ Arg(sm) = Val(sm) = AV(sm) = V.

Corollary 2. ⊢ sm ∈ Ref.

Theorem XI.1.4. ⊢ (α,β,R,S):S = R̆.α sm_R β. ⊃ .β sm_S α.

Proof. Use Thm.X.5.19, Cor. 7, and Thm.X.4.16.

*Corollary 1. ⊢ (α,β):α sm β. ⊃ .β sm α.

Corollary 2. ⊢ sm ∈ Sym.

Theorem XI.1.5. ⊢ (α,β,γ,R,S,T):T = R|S.α sm_R β.β sm_S γ. ⊃ .α sm_T γ.

Proof. Assume the hypothesis. Then by Thm.X.5.19, Cor. 6, T ∈ 1-1.
Also by Parts III and IV of the corollary to Thm.X.4.19, Arg(T) = Arg(R)
= α and Val(T) = Val(S) = γ.

*Corollary 1. ⊢ (α,β,γ):α sm β.β sm γ. ⊃ .α sm γ.

Corollary 2. ⊢ sm ∈ Trans.

Corollary 3. ⊢ sm ∈ Equiv.
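The three constructions behind these results, namely the identity relation, the converse, and the relative product, can be illustrated on finite relations. This is a sketch under the assumption that relations are Python sets of ordered pairs; all function names are our own.

```python
def arg(R):
    return {x for (x, _) in R}

def val(R):
    return {y for (_, y) in R}

def is_one_one(R):
    return len(arg(R)) == len(R) == len(val(R))

def identity_on(alpha):
    """The part of the identity relation I that is needed for alpha."""
    return {(x, x) for x in alpha}

def converse(R):
    """The converse of R, used for symmetry of sm."""
    return {(y, x) for (x, y) in R}

def relative_product(R, S):
    """R|S: x is related to z when x R y and y S z for some y."""
    return {(x, z) for (x, y) in R for (y2, z) in S if y == y2}

R = {(1, "a"), (2, "b"), (3, "c")}        # pairs {1,2,3} with {a,b,c}
S = {("a", 10), ("b", 20), ("c", 30)}     # pairs {a,b,c} with {10,20,30}
T = relative_product(R, S)
```

Here `T` is again 1-1, with the arguments of `R` and the values of `S`, as in the proof of Thm.XI.1.5.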

There is in intuitive mathematics a theorem, known as Cantor's theorem,
to the effect that no class is similar to the class of all its subclasses. That
is, (α).∼(α sm SC(α)). The standard proof is as follows: Assume that
α sm SC(α), so that there is a 1-1 correspondence between members of α
and subclasses of α. Consider those members of α which are not members
of the subclasses with which they are paired. Let A be the set of all such.
Then A is a subclass of α, and so is paired with some x which is a member
of α. If x ∈ A, then x must fail to be a member of the subclass with which
it is paired (because A was defined to contain just such x's). But the
subclass with which x is paired is A, and so ∼ x ∈ A. Thus the supposition
x ∈ A led to a contradiction, and so we conclude ∼ x ∈ A. Recalling that A
is paired with x, we infer that x is not a member of the subclass with which
it is paired. However, A contains all such members of α, so that x ∈ A.
Thus we again get a contradiction, and so refute our original assumption
that α sm SC(α).
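For a small finite class the intuitive argument can be checked exhaustively. The sketch below is an illustration of the classical diagonal construction only, under the assumption that a pairing is given as a Python dictionary from members of α to subclasses of α; it makes no claim about the formal system of the text, and the function names are ours.

```python
from itertools import combinations, product

def subclasses(alpha):
    """All subclasses of a finite class alpha."""
    items = sorted(alpha)
    return [frozenset(c)
            for r in range(len(items) + 1)
            for c in combinations(items, r)]

def diagonal(alpha, pairing):
    """The class A of those x in alpha that are not members of the
    subclass with which they are paired."""
    return frozenset(x for x in alpha if x not in pairing[x])

def diagonal_always_missed(alpha):
    """Try every assignment of subclasses to members of alpha and check
    that the class A is never among the assigned subclasses."""
    items = sorted(alpha)
    for choice in product(subclasses(alpha), repeat=len(items)):
        pairing = dict(zip(items, choice))
        if diagonal(alpha, pairing) in choice:
            return False
    return True
```

For α = {0, 1, 2} there are 8 subclasses and 512 assignments, and in each of them A is paired with no member of α.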

This theorem is very useful in the classical theory of cardinal numbers.
Unfortunately it seems to be false. For ⊢ V = SC(V), so that by Thm.
XI.1.3, ⊢ V sm SC(V); this result contradicts the theorem.

The resulting contradiction is the Cantor paradox, and it is difficult to
see how to avoid it in classical mathematics. However, in our symbolic
logic, it is easily avoided. The condition by which we defined the class A,
in the proof given above, is unstratified. Thus we have no way to prove
that there is such a class A, and the proof breaks down.

Actually,  if  we  modify  the  theorem  slightly,  then  the  proof  will  go 
through.    This  we  do  in  the  following  theorem. 

Theorem XI.1.6. ⊢ (α).∼(USC(α) sm SC(α)).

Proof. Assume USC(α) sm SC(α). Then by rule C, R ∈ 1-1, USC(α) =
Arg(R), SC(α) = Val(R). Put

A = x̂(x ∈ α.∼(x ∈ R({x}))).

A is stratified, and so

(1) (x):x ∈ A. ≡ .x ∈ α.∼(x ∈ R({x})).

So by (1), A ⊆ α. Hence A ∈ SC(α). So A ∈ Val(R). Then by rule C,
uRA. As R ∈ 1-1, we infer

(2) A = R(u).

As uRA, we get u ∈ Arg(R). So u ∈ USC(α). Then by rule C, u = {x}.
x ∈ α. Then by (2), A = R({x}). So by (1),

x ∈ A. ≡ .x ∈ α.∼(x ∈ A).

By truth values

⊢ P ≡ Q∼P. ⊃ .∼Q.

Taking P to be x ∈ A and Q to be x ∈ α gives ∼(x ∈ α). This is a
contradiction.

Intuitively, one would expect that α sm USC(α), the 1-1 relation being
that which pairs x with {x} for each x in α. However, this relation would
be unstratified, and we have no way in general of proving that it exists.
Actually this is fortunate, for if we could infer (α).α sm USC(α), then by
Thm.XI.1.6 we could infer (α).∼(α sm SC(α)), and then the Cantor
paradox would be forthcoming.

Nevertheless, for most of the classes α of mathematics, we do have
α sm USC(α). In such case, we say that α is Cantorian, and write

Can(α)         for         α sm USC(α).

Note. Can(α) is not stratified.

Theorem XI.1.7. ⊢ (α):Can(α). ⊃ .∼(α sm SC(α)).

Proof. Assume Can(α) and α sm SC(α). Then we get USC(α) sm α
and α sm SC(α), whence we get USC(α) sm SC(α). By Thm.XI.1.6,
this is a contradiction.
This is close enough to Cantor's theorem that we shall refer to it by that
name. Actually, as the theorem was stated by Cantor, the hypothesis
Can(α) was missing. This is because it was "obvious" to Cantor that
α sm USC(α). The concept of stratification had not then been invented.
It is our stratification restriction which apparently prevents us from
proving Can(α) for general α, even though we can prove it for many α's. In
other words, because of our stratification requirements, we apparently
cannot prove Cantor's theorem without the additional hypothesis Can(α),
and thus are saved from the Cantor paradox.

Theorem XI.1.8. ⊢ ∼Can(V).

Proof. Put V for α in Thm.XI.1.7. Then we get ⊢ (V sm SC(V)). ⊃ .
∼Can(V). However, by Thm.IX.6.19, Part II, ⊢ V = SC(V). So by
Thm.XI.1.3, ⊢ V sm SC(V).

This theorem states ⊢ ∼(V sm USC(V)), which seems intuitively wrong,
since we have the feeling that for any class α we should be able to prove
α sm USC(α) by letting x correspond to {x} for each x of α. Actually,
our stratification restrictions apparently prevent this, fortunately.

We have here the first case in which our formal system deviates in any
important particular from our intuitive ideas. Actually, we shall prove
Can(α) for the more common classes of mathematics (such as the class of
integers, the class of real numbers, etc.), so that for most purposes of
mathematics our formal logic is still following the intuitive logic fairly well.
In fact, the classes α for which we apparently cannot prove Can(α) are the
classes with a very large number of members such as V, or the class of all
ordinal numbers, or classes of this sort, which, from an intuitive point of
view, are not at all sharply defined. For such large, vague classes on the
frontier of our comprehension, it is not really surprising that some
properties, such as Can(α), which we have extrapolated from finite classes,
should fail to hold.

We  now  prove  some  results  which  will  be  needed  when  we  study  addition 
of  cardinal  numbers. 

Theorem XI.1.9. ⊢ (α):α sm Λ. ≡ .α = Λ.

Proof. By Thm.XI.1.3, ⊢ α = Λ. ⊃ .α sm Λ. Now assume α sm Λ.
Then by rule C,

R ∈ 1-1.α = Arg(R).Λ = Val(R).

Then by Thm.X.4.3, R = Λ and α = Λ.

Corollary. ⊢ (α):α sm Λ. ≡ .α ∈ 0.

Theorem XI.1.10. ⊢ (x,y).{x} sm {y}.

Proof. Take R to be {⟨x,y⟩}. Then ⊢ (u,v):uRv. ≡ .u = x.v = y and
⊢ R ∈ Rel. So ⊢ R ∈ 1-1.{x} = Arg(R).{y} = Val(R).

Theorem XI.1.11. ⊢ (α,x):α sm {x}. ≡ .α ∈ 1.

Proof. Let α sm {x}. Then by rule C, R ∈ 1-1.α = Arg(R).{x} = Val(R).
So x ∈ Val(R), and by rule C, yRx. So y ∈ Arg(R), whence y ∈ α, whence

(1) {y} ⊆ α.

To show that also α ⊆ {y}, let z ∈ α. Then z ∈ Arg(R) and by rule C, zRw.
So w ∈ Val(R), w ∈ {x}, x = w, and finally zRx. Since yRx and R ∈ 1-1,
we get z = y and z ∈ {y}. Then by (1) α = {y}, and thus α ∈ 1.

Conversely, let α ∈ 1. Then by rule C, α = {y}. Then by Thm.XI.1.10,
α sm {x}.

Theorem XI.1.12. ⊢ (α,β,γ,δ,R,S,T):T = R ∪ S.α sm_R γ.β sm_S δ.
α ∩ β = Λ.γ ∩ δ = Λ. ⊃ .(α ∪ β) sm_T (γ ∪ δ).

Proof. Assume the hypothesis. Then by Thm.X.5.21, Cor. 2, T ∈ 1-1.
Also by Ex.X.4.6, Parts (a) and (b), α ∪ β = Arg(T) and γ ∪ δ = Val(T).

Corollary. ⊢ (α,β,γ,δ):α ∩ β = Λ.γ ∩ δ = Λ.α sm γ.β sm δ. ⊃ .
(α ∪ β) sm (γ ∪ δ).
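The role of the two disjointness hypotheses can be seen on finite relations. In the sketch below relations are again assumed to be Python sets of ordered pairs, and the helper names are illustrative.

```python
def arg(R):
    return {x for (x, _) in R}

def val(R):
    return {y for (_, y) in R}

def is_one_one(R):
    return len(arg(R)) == len(R) == len(val(R))

def union_of_pairings(R, S):
    """T = R ∪ S, formed only when the arguments of R and S are disjoint
    and the values of R and S are disjoint."""
    if arg(R) & arg(S) or val(R) & val(S):
        raise ValueError("arguments or values overlap")
    return R | S

R = {(1, "a"), (2, "b")}      # pairs alpha = {1,2} with gamma = {a,b}
S = {(3, "c")}                # pairs beta = {3} with delta = {c}
T = union_of_pairings(R, S)
```

Here `T` pairs {1, 2, 3} with {"a", "b", "c"}. Without disjointness the union need not be 1-1: the union of `{(1, "a")}` and `{(2, "a")}` gives the value "a" two arguments.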

Theorem XI.1.13. ⊢ (α,β,x,y):α ∪ {x} = β ∪ {y}.∼ x ∈ α.∼ y ∈ β.
⊃ .α sm β.

Proof. Assume

(1) α ∪ {x} = β ∪ {y},

(2) ∼ x ∈ α,

(3) ∼ y ∈ β.

Then by Thm.IX.6.6, Cor. 1,

(4) α ∩ {x} = Λ,

(5) β ∩ {y} = Λ.

Case 1. x = y. Then {x} = {y}. Then by (1), (4), and (5) and
Thm.IX.4.9, Cor. 3, we get α = β. So α sm β.

Case 2. x ≠ y. By (1), x ∈ β ∪ {y}. However, ∼ x ∈ {y}. So

(6) x ∈ β.

Similarly

(7) y ∈ α.

Put

(8) γ = α − {y}.

Then by (7) and Thm.IX.6.5, Cor. 3,

(9) α = γ ∪ {y}.

Also

(10) Λ = γ ∩ {y}.

By (2) and (8), ∼ x ∈ γ. So by Thm.IX.6.6, Cor. 1,

(11) Λ = γ ∩ {x}.

Now let z ∈ β. Then by (3),

(12) z ≠ y

and by (1)

(13) z ∈ α ∪ {x}.

If z = x, then z ∈ γ ∪ {x}. If z ≠ x, then by (13), z ∈ α, and so by (12)
and (8), z ∈ γ, and hence z ∈ γ ∪ {x}. So

(14) β ⊆ γ ∪ {x}.

Conversely, let z ∈ γ ∪ {x}. If z ∈ γ, then by (8), z ∈ α and z ≠ y. So
by (1), z ∈ β. If z ∈ {x}, then z = x and by (6), z ∈ β. So by (14),

(15) β = γ ∪ {x}.

Now ⊢ γ sm γ by Thm.XI.1.3 and ⊢ {x} sm {y} by Thm.XI.1.10. So by
(10), (11), (9), and (15), we get α sm β by Thm.XI.1.12, corollary.

We  now  prove  some  results  which  will  be  needed  when  we  study  inequal- 
ities between  cardinal  numbers. 

Theorem XI.1.14. ⊢ (β,R):R ∈ 1-1.β ⊆ Arg(R).Val(R) ⊆ β. ⊃ .
β sm (Val(R)).

Proof. Assume

(1) R ∈ 1-1,

(2) β ⊆ Arg(R),

(3) Val(R) ⊆ β.

Define

(4) γ = Clos(β − Val(R),xRz),

(5) A = β ∩ γ,

(6) B = β ∩ γ̄,

(7) C = Val(R) ∩ γ,

(8) D = Val(R) ∩ γ̄.

We have immediately

(9) ⊢ A ∩ B = Λ,

(10) ⊢ C ∩ D = Λ,

(11) ⊢ A ∪ B = β,

(12) ⊢ C ∪ D = Val(R).

So by Thm.XI.1.12, corollary, it suffices to prove A sm C and B sm D.
Clearly by (2) and (5)

(13) A ⊆ Arg(R).

By (4) and Thm.X.4.37,

(14) ⊢ γ = (β − Val(R)) ∪ R"γ.

Clearly ⊢ (β − Val(R)) ⊆ β, and by (3) and Thm.X.4.24, Part I,
R"γ ⊆ β. So by (14) and Thm.IX.4.18, Part II, γ ⊆ β, so that by (5),

(15) A = γ.

By (14) and (15), R"A ⊆ γ. Also by Thm.X.4.24, Part I, R"A ⊆ Val(R).
So by Thm.X.4.18, Part I, and (7)

(16) R"A ⊆ C.

By (7) and (14),

⊢ C = Val(R) ∩ ((β − Val(R)) ∪ R"γ)
    = (Val(R) ∩ (β − Val(R))) ∪ (Val(R) ∩ R"γ)
    = Λ ∪ (Val(R) ∩ R"γ)
    = Val(R) ∩ R"γ.

So ⊢ C ⊆ R"γ, and so by (15), C ⊆ R"A. With (16), this gives

(17) C = R"A.

Then by (1), (13), (17), and Thm.XI.1.2,

(18) A sm C.

By (3), (6), (8), and Thm.IX.4.16, Part I,

(19) D ⊆ B.

By (4) and Thm.X.4.34, ⊢ β − Val(R) ⊆ γ. So by Thm.IX.4.17,
⊢ γ̄ ⊆ β̄ ∪ Val(R). So by (6) and Thm.IX.4.16, Part I, ⊢ B ⊆ β ∩ (β̄ ∪
Val(R)). So by Thm.IX.4.4, Part XVIII, ⊢ B ⊆ β ∩ Val(R). Then by
(3), B ⊆ Val(R). Also, by (6), B ⊆ γ̄. So by (8) and Thm.IX.4.18,
Part I, B ⊆ D. So by (19), B = D and hence

(20) B sm D.

Our theorem now follows by Thm.XI.1.12, corollary.

Let us consider a geometrical illustration of this theorem. Let α consist
of the points interior to a square and let β consist of the points interior to
the circle inscribed in the square. Let R be a one-to-one transformation
which shrinks the square in the ratio 1/√2. Then Arg(R) is α and
Val(R) consists of the points interior to the square which we have shown
inscribed in the circle in Fig. XI.1.1. We have β ⊆ Arg(R) and Val(R) ⊆ β.

Fig. XI.1.1.

Fig. XI.1.2.

Then by the theorem just proved, β sm Val(R). That is, there is a one-to-
one correspondence between the interior points of the circle and the interior
points of the square inscribed in the circle.

One could set up directly such a one-to-one correspondence as follows.
We note that β − Val(R) is a region shaped like the shaded portion of
Fig. XI.1.2. Since R shrinks figures in the ratio 1/√2, R"(β − Val(R))
is a smaller region of similar shape lying just inside the given region,
R"R"(β − Val(R)) is a similar still smaller region, etc. Now we make the
points of β correspond to those of Val(R) by the following scheme. The
points of β − Val(R) in β are to be paired with the points of R"(β − Val(R))
in Val(R). The points of R"(β − Val(R)) in β are to be paired with the
points of R"R"(β − Val(R)) in Val(R). We continue in this way, pairing
regions δ in β with R"δ in Val(R). By this process we pair the points of

(β − Val(R)) ∪ R"(β − Val(R)) ∪ R"R"(β − Val(R)) ∪ ···

in β with the points of

R"(β − Val(R)) ∪ R"R"(β − Val(R)) ∪ ···

in Val(R). The remaining set of unpaired points of β lies in Val(R), and
coincides exactly with the set of unpaired points of Val(R). All these
points are then paired with themselves.

In point of fact, the above scheme for setting β and Val(R) into one-to-one
correspondence is just that used in our proof of Thm.XI.1.14. Thus, by
taking γ to be Clos(β − Val(R),xRz), we arrange that γ is

(β − Val(R)) ∪ R"(β − Val(R)) ∪ R"R"(β − Val(R)) ∪ ···.

Our definitions of A, B, C, and D are such that A is just γ and C is just
R"A, so that C is

R"(β − Val(R)) ∪ R"R"(β − Val(R)) ∪ ···.

Then the points of A and C are paired by R. It also turns out that B = D,
and so the points of B and D are paired with themselves.
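The same pairing scheme can be traced on an arithmetical instance in place of the geometrical one. The instance below is our own assumption and does not occur in the text: R multiplies a natural number by 3, so that Val(R) is the class of multiples of 3, and β is the class of natural numbers not congruent to 1 modulo 3, which gives Val(R) ⊆ β ⊆ Arg(R). For this instance, membership in γ can be decided by stripping factors of 3.

```python
def R(n):
    """A one-to-one relation on the natural numbers: n is paired with 3n."""
    return 3 * n

def in_val(n):
    """Val(R): the multiples of 3."""
    return n % 3 == 0

def in_beta(n):
    """beta: chosen so that Val(R) ⊆ beta ⊆ Arg(R)."""
    return n % 3 != 1

def in_gamma(n):
    """gamma = Clos(beta − Val(R), xRz). Here beta − Val(R) consists of
    the n with remainder 2 modulo 3, so n is in gamma exactly when
    removing all factors of 3 leaves such a number."""
    if n == 0:
        return False
    while n % 3 == 0:
        n //= 3
    return n % 3 == 2

def pairing(n):
    """The correspondence of the proof: R on A = gamma, and the identity
    on B, the rest of beta."""
    if not in_beta(n):
        raise ValueError("n is not a member of beta")
    return R(n) if in_gamma(n) else n
```

For instance, 2 and 6 lie in γ and are sent to 6 and 18, while 0, 3, and 9 lie outside γ and are paired with themselves.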

Our Theorem XI.1.14 is a lemma for the Schröder-Bernstein theorem to
the effect that, if α is similar to a subset of β and β is similar to a subset of α,
then α is similar to β.

Theorem XI.1.15. ⊢ (α,β,γ,δ):α sm γ.γ ⊆ β.β sm δ.δ ⊆ α. ⊃ .α sm β.

Proof. Assume α sm γ, γ ⊆ β, β sm δ, δ ⊆ α. Then by rule C,

R ∈ 1-1.α = Arg(R).γ = Val(R),

S ∈ 1-1.β = Arg(S).δ = Val(S).

We have Val(R) ⊆ Arg(S). So by Thm.X.4.10, Part II, R = R↾Arg(S).
So by Thm.X.4.19, Part I,

(1) Arg(R|S) = Arg(R) = α.

Also by Thm.X.4.19, corollary, Part II, Val(R|S) ⊆ Val(S). So

(2) Val(R|S) ⊆ δ.

By (1),

(3) δ ⊆ Arg(R|S).

Also by Thm.X.5.19, Cor. 6,

(4) R|S ∈ 1-1.

Hence by (2), (3), (4), and Thm.XI.1.14,

(5) δ sm Val(R|S).

However, by definition of sm_T, if T ∈ 1-1, then Arg(T) sm_T Val(T).
Taking T to be R|S gives Arg(R|S) sm Val(R|S). So by (1) and (5), α sm δ.
However, β sm δ. So α sm β.

We  now  prove  some  results  which  will  be  needed  when  we  study  multi- 
plication of  cardinal  numbers. 

Let α and β be finite classes, with m members in α and n members in β.
How many members does α × β have? To get a member of α × β we
must choose a member x from α and a member y from β and form the
ordered pair ⟨x,y⟩. For each given x of α, we can choose y from β in n ways
and form n ordered pairs ⟨x,y⟩. Doing this for each of the m members x of α,
we get m groups of n ordered pairs. Clearly these ordered pairs are all
distinct, so that we have a total of m × n ordered pairs. That is, α × β
has m × n members. We generalize to the case in which either α or β is an
infinite class, and define the product m × n as the number of members of
α × β.

The next theorem and its mate, Thm.XI.1.18, state that the number of
members of α × β depends only on the numbers of members of α and β,
respectively, and not on any other properties of α and β. This is as it
should be.

Theorem XI.1.16. ⊢ (α,β,γ):α sm β. ⊃ .(α × γ) sm (β × γ).

Proof. Let α sm β. Then by rule C, R ∈ 1-1.α = Arg(R).β = Val(R).
Put

S = ûv̂(Ex,y,z).u = ⟨x,z⟩.v = ⟨y,z⟩.xRy.

Clearly S is stratified and so

(1) S ∈ Rel.

Now let uSv₁ and uSv₂. Then by rule C,

u = ⟨x₁,z₁⟩.v₁ = ⟨y₁,z₁⟩.x₁Ry₁

and

u = ⟨x₂,z₂⟩.v₂ = ⟨y₂,z₂⟩.x₂Ry₂.

So x₁ = x₂ and z₁ = z₂. Then from x₁Ry₁, x₂Ry₂, and R ∈ 1-1, we get
y₁ = y₂. So v₁ = v₂. Similarly, from u₁Sv and u₂Sv, we get u₁ = u₂. Hence

(2) S ∈ 1-1.

Now let u ∈ α × γ. Then by Thm.X.2.12 and rule C, u = ⟨x,z⟩.x ∈ α.z ∈ γ.
Then x ∈ Arg(R) so that xRy. So uS(⟨y,z⟩). So u ∈ Arg(S). Thus

(3) α × γ ⊆ Arg(S).

Now let v ∈ β × γ. Then similarly v = ⟨y,z⟩.y ∈ β.z ∈ γ and xRy, and
(⟨x,z⟩)Sv. Accordingly x ∈ Arg(R) so that x ∈ α. Then ⟨x,z⟩ ∈ α × γ. So
v ∈ S"(α × γ). Thus

(4) β × γ ⊆ S"(α × γ).

Now let v ∈ S"(α × γ). Then u ∈ α × γ.uSv. So by rule C with
Thm.X.2.12, u = ⟨x₁,z₁⟩.x₁ ∈ α.z₁ ∈ γ and u = ⟨x,z⟩.v = ⟨y,z⟩.xRy. So z = z₁
and z ∈ γ. Also, from xRy, we get y ∈ Val(R) and so y ∈ β. So v = ⟨y,z⟩.
y ∈ β.z ∈ γ. Thence v ∈ β × γ. Thus S"(α × γ) ⊆ β × γ. So by (4),

(5) β × γ = S"(α × γ).

Then by (2), (3), (5), and Thm.XI.1.2, our theorem follows.

We could prove

⊢ (α,β,γ):α sm β. ⊃ .(γ × α) sm (γ × β)

by the same methods. However, we shall give an alternative proof later.

We now prove that α × β and β × α have the same number of elements.
In other words, multiplication of cardinal numbers is commutative.  In
case α and β are classes of real numbers, one can give an intuitive proof as
follows.  Mark the members of α as points on the x-axis and the members of
β as points on the y-axis.  Then the members ⟨x,y⟩ of α × β are just the
points (x,y) in the x-y plane which are simultaneously on a vertical line
through a point of α and on a horizontal line through a point of β.  If now
we reflect the entire figure in the 45-degree line y = x, the points of α × β
go into the points of β × α.

This  is  the  basic  idea  of  the  proof  which  we  now  give. 

Theorem XI.1.17.  ⊢ (α,β).(α × β) sm (β × α).

Proof.  Put

R = x̂ŷ(Eu,v).x = ⟨u,v⟩.y = ⟨v,u⟩.

One proves easily

(1)  ⊢ R ε 1-1

and ⊢ Arg(R) = V × V.  So

(2)  α × β ⊆ Arg(R).

Now let y ε β × α.  So by rule C and Thm.X.2.12, y = ⟨v,u⟩.v ε β.u ε α.
So ⟨u,v⟩ ε α × β and ⟨u,v⟩Ry.  Hence y ε R"(α × β).

Conversely, let y ε R"(α × β).  So by rule C, xRy and x ε α × β.  Then
by rule C with xRy, we get

x = ⟨u,v⟩.y = ⟨v,u⟩,

and by rule C with Thm.X.2.12,

x = ⟨z,w⟩.z ε α.w ε β.

Then z = u and v = w.  So y = ⟨v,u⟩.v ε β.u ε α.  That is, y ε β × α.
So

(3)  ⊢ β × α = R"(α × β).

From (1), (2), and (3), our theorem follows by Thm.XI.1.2.
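
The reflection idea behind this proof can be tried out on finite classes.  The Python sketch below is an informal illustration under stated assumptions: the names `swap_relation` and `is_one_to_one` are hypothetical helpers, and the relation is built only on α × β rather than on all of V × V.

```python
def swap_relation(alpha, beta):
    """Pairs (x, y) with x = (u, v) in alpha x beta and y = (v, u)."""
    return {((u, v), (v, u)) for u in alpha for v in beta}


def is_one_to_one(rel):
    """True when no argument and no value occurs in two distinct pairs."""
    args = {x for x, _ in rel}
    vals = {y for _, y in rel}
    return len(args) == len(rel) == len(vals)


alpha, beta = {1, 2, 3}, {"p", "q"}
R = swap_relation(alpha, beta)
image = {y for _, y in R}

print(is_one_to_one(R))                                  # True
print(image == {(v, u) for v in beta for u in alpha})    # True
```

The two checks correspond to steps (1) and (3) of the proof for this particular pair of classes.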
Theorem XI.1.18.

I.  ⊢ (α,β,γ):α sm β. ⊃ .(γ × α) sm (γ × β).
II.  ⊢ (α,β,γ,δ):α sm γ.β sm δ. ⊃ .(α × β) sm (γ × δ).

Proof of I.  By Thm.XI.1.17, ⊢ (γ × α) sm (α × γ) and ⊢ (β × γ) sm
(γ × β).  Also, if α sm β, then (α × γ) sm (β × γ) by Thm.XI.1.16.

Proof of II.  From α sm γ we get (α × β) sm (γ × β) and from β sm δ
we get (γ × β) sm (γ × δ).

We  now  give  proofs  for  theorems  which  will  lead  to  the  two  results  that 
cardinal  multiplication  is  associative  and  that  if  a  cardinal  number  is 
multiplied  by  unity  the  product  is  the  number. 

Theorem XI.1.19.  ⊢ (α,β,γ).((α × β) × γ) sm (α × (β × γ)).

Proof.  Put

R = x̂ŷ(Eu,v,w).x = ⟨⟨u,v⟩,w⟩.y = ⟨u,⟨v,w⟩⟩,

and proceed as in the proof of Thm.XI.1.17 to prove

(1)  ⊢ R ε 1-1,

(2)  ⊢ (α × β) × γ ⊆ Arg(R),

(3)  ⊢ α × (β × γ) = R"((α × β) × γ).

Theorem XI.1.20.  ⊢ (α,x).α sm (α × {x}).

Proof.  Put

R = ẑŷ(y = ⟨z,x⟩),

and proceed as in the proof of Thm.XI.1.17 to prove

(1)  ⊢ R ε 1-1,

(2)  ⊢ α ⊆ Arg(R),

(3)  ⊢ α × {x} = R"α.

We now prove some results which will be needed when we study
exponentiation of cardinal numbers.  We first introduce the definition

α ∧ β     for     R̂(R ε Funct.Arg(R) = α.Val(R) ⊆ β).

If α and β are variables, the explicitly indicated occurrences of α and β
are free and are the only free occurrences of any variables in α ∧ β.  Also
α ∧ β is stratified if and only if α and β have the same type, and the type
of α ∧ β will be one greater than the type of α and β.

The motivation for the definition of α ∧ β comes about as follows.  Let
α and β be finite classes, with m members in α and n members in β.  How
many members does α ∧ β have?  To get a member of α ∧ β, we must
form a function R whose argument range is α and whose values all lie in β.
To form such a function R, we have to designate a member of β to go with
each member of α.  To the first member of α, we can assign any of the n
members of β, so that we have n possible choices.  Quite independently, we
can assign a member of β to the second member of α in n distinct ways.
Indeed, quite independently, we can assign members of β to each given
member of α in n distinct ways.  Thus the total number of possible ways to
assign a member of β to each member of α is n × n × ··· × n ways, where
the product is taken over m factors.  That is, n^m distinct functions can be
defined such that they assign a member of β to each member of α.  Thus
α ∧ β has n^m members.
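
For small finite classes the count n^m can be confirmed by listing the functions outright.  The Python sketch below is an informal illustration, not part of the formal system; the name `function_space` and the representation of a function as a frozenset of (argument, value) pairs are assumptions made for the example.

```python
from itertools import product


def function_space(alpha, beta):
    """All functions from alpha into beta, each as a frozenset of (arg, value) pairs."""
    args = list(alpha)
    vals = list(beta)
    # One independent choice of value for each argument.
    return {frozenset(zip(args, choice)) for choice in product(vals, repeat=len(args))}


alpha = {"a", "b", "c"}   # m = 3
beta = {0, 1}             # n = 2
funcs = function_space(alpha, beta)
print(len(funcs) == len(beta) ** len(alpha))
```

With α empty the sketch returns exactly one function, the empty one, and with β empty and α nonempty it returns none; both agree with the formula n^m.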

Accordingly, we can define exponentiation for finite cardinals in terms of
α ∧ β.  Indeed, we can and will define exponentiation for any cardinal
numbers, finite or infinite, in terms of α ∧ β.

The use of ∧, which is just an upside-down V, as a symbol for use in
exponentiation has occurred elsewhere (see Quine, 1940, for instance).

We now proceed to prove a series of theorems about α ∧ β from which
the familiar laws of exponents will be forthcoming.

It is difficult to motivate the proofs of these theorems.  The theorems
themselves are devised by considering the laws of exponents and noting
what theorems about similarity are needed to prove them.  In some cases
it was fairly clear how to choose the 1-1 relation which would prove the
desired similarity.  However, since α ∧ β is a type higher than α and β,
the issue is somewhat confused by the necessity for preserving stratification.
Thus, in Thm.XI.1.30, we shall find USC(δ) at a place where intuitively
we should expect to find δ.  From an intuitive point of view, Can(δ) is
obvious, so that it would not matter whether we have δ or USC(δ).  However,
in the symbolic logic, we know how to prove Can(δ) for only special
δ's and so must take care to distinguish between USC(δ) and δ in general.

In the next section, in which we prove the laws of exponents, it will be
seen which law of exponents is derived from each of the various theorems
which we now prove.

Theorem XI.1.21.

I.  ⊢ (α,β).∃(α ∧ β).
*II.  ⊢ (α,β,R):R ε (α ∧ β). ≡ .R ε Funct.Arg(R) = α.Val(R) ⊆ β.
Theorem XI.1.22.  ⊢ (α,β,γ):α sm β. ⊃ .(α ∧ γ) sm (β ∧ γ).

Proof.  Let α sm β.  Then

(1)  R ε 1-1.Arg(R) = α.Val(R) = β.

Let

(2)  W = (α ∧ γ)↿(λS(R̆|S)).

By Thm.X.5.23, Part II, and Thm.X.5.7, corollary,

(3)  W ε Funct,

and by Thm.X.5.23, Part IV, and Thm.X.4.9, Part I,

(4)  Arg(W) = (α ∧ γ).

Let SWT.  Then by (2), S ε Rel, Arg(S) = α, and T = R̆|S.  So
R|T = (R|R̆)|S.  However, by Thm.X.5.12, corollary, and (1), R|R̆ =
Val(R̆)↿I = Arg(R)↿I = α↿I.  So by Thm.X.4.20, Part I, R|T = (α↿I)|S =
α↿(I|S) = α↿S by Thm.X.5.11.  So by Arg(S) = α and Thm.X.4.10, Part I,
R|T = S.  So SWT. ⊃ .S = R|T.  Hence W̆ ε Funct, and so by (3),

(5)  W ε 1-1.

Now let T ε Val(W).  Then S ε Funct, Arg(S) = α, Val(S) ⊆ γ, T = R̆|S.
Then by Thm.X.4.19, corollary, Part III, Arg(T) = Arg(R̆|S) = Arg(R̆) =
Val(R) = β.  Also by Thm.X.4.19, corollary, Part II, Val(T) ⊆ γ.  Also by
Thm.X.5.9, R̆|S ε Funct, so that T ε Funct.  Then T ε (β ∧ γ).  So

(6)  Val(W) ⊆ (β ∧ γ).

Conversely, let T ε (β ∧ γ).  So T ε Funct, Arg(T) = β, Val(T) ⊆ γ.
Then R|T ε Funct, Arg(R|T) = α, Val(R|T) ⊆ γ.  Also R̆|(R|T) =
(R̆|R)|T = (β↿I)|T = β↿(I|T) = β↿T = T.  So by (2), (R|T)WT.  Hence
T ε Val(W).  So by (6),

(7)  Val(W) = (β ∧ γ).

Then by (5), (4), and (7), (α ∧ γ) sm (β ∧ γ).
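
The transport of a function along a 1-1 relation, which is the core of this proof, can be sketched on finite data.  The Python fragment below is an informal illustration only; `converse` and `compose` are hypothetical helper names, relations are represented as sets of pairs, and `compose(r, s)` is meant to model the relative product R|S in the order the proof uses it (first R, then S).

```python
def converse(rel):
    """The converse relation: (y, x) for every (x, y) in rel."""
    return {(y, x) for x, y in rel}


def compose(r, s):
    """Relative product: x is related to z when x r y and y s z for some y."""
    return {(x, z) for x, y in r for y2, z in s if y == y2}


R = {(1, "a"), (2, "b")}      # 1-1, Arg(R) = {1, 2}, Val(R) = {"a", "b"}
S = {(1, 10), (2, 10)}        # a function from {1, 2} into {10, 20}

T = compose(converse(R), S)   # the transported function on {"a", "b"}
print(T == {("a", 10), ("b", 10)})
print(compose(R, T) == S)     # S is recovered from T, as in the step R|T = S
```

The second check mirrors the step that makes the converse of W a function: distinct S cannot be sent to the same T.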

Theorem XI.1.23.  ⊢ (α,β,γ):α sm β. ⊃ .(γ ∧ α) sm (γ ∧ β).

Proof.  Let α sm β.  Then R ε 1-1.Arg(R) = α.Val(R) = β.  So let

W = (γ ∧ α)↿(λS(S|R))

and proceed as in the proof of Thm.XI.1.22.

Corollary.  ⊢ (α,β,γ,δ):α sm γ.β sm δ. ⊃ .(α ∧ β) sm (γ ∧ δ).

Theorem XI.1.24.  ⊢ (β).(Λ ∧ β) = 0.

Proof.  Let R ε (Λ ∧ β).  Then Arg(R) = Λ.  So by Thm.X.4.3, Part I,
R = Λ.  So R ε 0.  Conversely, let R ε 0.  Then R = Λ.  So R ε Funct.
Arg(R) = Λ.Val(R) ⊆ β.  So R ε (Λ ∧ β).

Theorem XI.1.25.  ⊢ (α):α ≠ Λ. ⊃ .(α ∧ Λ) = Λ.

Proof.  Let R ε (α ∧ Λ).  Then Arg(R) = α, Val(R) ⊆ Λ.  So by
Thm.IX.4.13, Cor. 2, Val(R) = Λ, R = Λ, Arg(R) = Λ, α = Λ.  Thus

⊢ R ε (α ∧ Λ). ⊃ .α = Λ.

So

⊢ α ≠ Λ. ⊃ .∼ R ε (α ∧ Λ).

So by Thm.IX.4.12, Part II,

⊢ α ≠ Λ. ⊃ .(α ∧ Λ) = Λ.

Theorem XI.1.26.  ⊢ (α,x):({x} ∧ α) = USC({x} × α).

Proof.  Let R ε ({x} ∧ α).  Then R ε Funct.Arg(R) = {x}.Val(R) ⊆ α.
Then x ε Arg(R), so that by Thm.X.5.3, Cor. 4, R(x) ε Val(R), so that
R(x) ε α and

(1)  ⟨x,R(x)⟩ ε {x} × α.

Also ⟨x,R(x)⟩ ε R, so that

(2)  {⟨x,R(x)⟩} ⊆ R.

In addition to our assumption R ε ({x} ∧ α), let us further assume z ε R.
Then by Thm.X.3.1 and rule C, z = ⟨u,v⟩ and uRv, so that u ε Arg(R).
Hence u ε {x}, u = x, so that xRv.  By Thm.X.5.3, v = R(x), so that
z = ⟨x,R(x)⟩.  So z ε {⟨x,R(x)⟩}.  Thus by (2),

(3)  {⟨x,R(x)⟩} = R.

Then by (1), (3), and Thm.IX.6.10, Part III, R ε USC({x} × α).  So
finally

(4)  ⊢ ({x} ∧ α) ⊆ USC({x} × α).

Conversely, let R ε USC({x} × α).  Then by rule C, R = {z} and
z ε {x} × α.  So R ⊆ {x} × α, so that

(5)  R ε Rel.

Also z ε R, and by Thm.X.2.12 and rule C, z = ⟨u,v⟩, u ε {x}, v ε α.  So
uRv and u = x.  Thus xRv, giving

(6)  {x} ⊆ Arg(R),

(7)  {v} ⊆ Val(R).

In addition to our assumption R ε USC({x} × α), let us further assume
wRy.  Then ⟨w,y⟩ ε R.  So ⟨w,y⟩ = z = ⟨u,v⟩, giving w = u, y = v, and
hence w = x.  Thus

wRy. ⊃ .w = x.y = v.

From this follows

(8)  R ε Funct,

and Arg(R) ⊆ {x}, Val(R) ⊆ {v}.  Then by (6) and (7),

(9)  Arg(R) = {x}

and Val(R) = {v}.  However, we had v ε α, so that

(10)  Val(R) ⊆ α.

Thus R ε ({x} ∧ α), and

(11)  ⊢ USC({x} × α) ⊆ ({x} ∧ α).

Theorem XI.1.27.  ⊢ (α,x).(α ∧ {x}) = {α↿λy(x)}.

Proof.  Let R ε (α ∧ {x}).  Then R ε Funct.Arg(R) = α.Val(R) ⊆ {x}.
If in addition uRv, then we get u ε α, v = x, so that by Thm.X.5.23, Part I,
u(α↿λy(x))v.  Conversely, if u(α↿λy(x))v, then u ε α, v = x, so that u ε Arg(R).
Hence uRw, w ε Val(R), w = x, w = v, and finally uRv.  Accordingly, we
have shown R = α↿λy(x).  Hence

(1)  ⊢ (α ∧ {x}) ⊆ {α↿λy(x)}.

Conversely, let R ε {α↿λy(x)}.  Then R = α↿λy(x).  So R ε Funct and
Arg(R) = α.  Let in addition v ε Val(R).  Then uRv.  So u ε α.u(λy(x))v.
So by Thm.X.5.23, Part I, v = x.  So v ε {x}.  Hence Val(R) ⊆ {x}.  So
R ε (α ∧ {x}).  Thus

(2)  ⊢ {α↿λy(x)} ⊆ (α ∧ {x}).

The proof of the next theorem is based on the idea of characteristic
functions.  If α is a set of real numbers, then we say that f is the
corresponding characteristic function provided that for every real x

f(x) = 1     if     x ε α,
f(x) = 0     if     ∼ x ε α.

Clearly, each α determines a unique characteristic function, and conversely
each characteristic function determines a unique α.

More generally, if α is a subclass of a class β, we say that f is a
characteristic function corresponding to α if for every x in β

f(x) = 1     if     x ε α,
f(x) = 0     if     ∼ x ε α.

Then there is exactly one characteristic function for each subclass of β, and
vice versa.  So the class of characteristic functions over β is equinumerous
with SC(β).  However, if A is {1,0}, then the class of characteristic
functions over β is exactly β ∧ A.  So (β ∧ A) sm SC(β).

Clearly there is nothing magical in the choice of 1 and 0 as the values for
our characteristic functions over β.  When β is the class of real numbers,
there is a certain analytical convenience in the choice of 1 and 0.  However,
in general we may use any two distinct objects.  So, in general, if A ε 2,
then (β ∧ A) sm SC(β).

This  is  our  next  theorem,  and  its  proof  is  essentially  that  just  indicated. 

Theorem XI.1.28.  ⊢ (α):α ε 2. ⊃ .(β).(β ∧ α) sm SC(β).

Proof.  Let α ε 2.  Then by Thm.X.1.16, Cor. 1, and rule C,

(1)  x ≠ y,

(2)  α = {x,y}.

Put

(3)  W = (β ∧ α)↿(λR(R̆"{x})).

Then

(4)  W ε Funct,

(5)  Arg(W) = (β ∧ α).

As an additional assumption, let RWφ and SWφ.  Then

(6)  R,S ε Funct,

(7)  Arg(R) = β = Arg(S),

(8)  Val(R) ⊆ α,

(9)  Val(S) ⊆ α,

(10)  R̆"{x} = φ = S̆"{x}.

We now wish to prove R = S by use of Thm.X.5.16, to which end we
make the further assumption z ε Arg(R).  Then by (7) and Thm.X.5.3,
Cor. 1,

(11)  zR(R(z)),

(12)  zS(S(z)).

So R(z) ε Val(R) and S(z) ε Val(S), so that by (8) and (9), R(z), S(z) ε α.
Then by (2) and Thm.IX.6.2, Part II,

(13)  R(z) = x ∨ R(z) = y,

(14)  S(z) = x ∨ S(z) = y.

Case 1.  R(z) = x.  Then by (11), zRx, xR̆z, and so by (10), z ε φ, and
xS̆z, and zSx.  Then by (12) and Thm.X.5.3, x = S(z).  So R(z) = S(z).

Case 2.  S(z) = x.  Then by similar reasoning, R(z) = S(z).

Case 3.  R(z) ≠ x.S(z) ≠ x.  Then by (13) and (14), R(z) = y and
S(z) = y, and hence R(z) = S(z).

So in all cases R(z) = S(z).  So from the assumptions α ε 2, and RWφ.SWφ,
we can infer by Thm.X.5.16 that R = S.  So from the single assumption
α ε 2, we get W̆ ε Funct, and so by (4)

(15)  W ε 1-1.

Now make the additional assumption φ ε Val(W).  Then RWφ, and so
Arg(R) = β and φ = R̆"{x}.  Then by Thm.X.4.23, φ = Arg(R↾{x}),
whence φ ⊆ β by Thm.X.4.6, Part II, and Thm.X.4.2, Part I.  So φ ε SC(β).
Thus

(16)  Val(W) ⊆ SC(β).

Conversely, make the additional assumption φ ε SC(β), whence φ ⊆ β.
Put

R = β↿(ûv̂(u ε φ.v = x.∨.∼ u ε φ.v = y)).

Clearly

(17)  R ε Rel,

(18)  Arg(R) = β,

(19)  Val(R) ⊆ α.

Also u ε φ.uRv. ⊃ .v = x and ∼ u ε φ.uRv. ⊃ .v = y, so that we can prove
by cases that uRv.uRw. ⊃ .v = w.  Hence R ε Funct, and so by (18) and
(19)

(20)  R ε (β ∧ α).

Now since φ ⊆ β, we can infer successively z ε φ. ⊃ .z ε β.z ε φ, z ε φ. ⊃ .zRx,
z ε φ. ⊃ .xR̆z, z ε φ. ⊃ .z ε R̆"{x}.  Thus

(21)  φ ⊆ R̆"{x}.

Likewise, from ∼ z ε φ.zRx. ⊃ .x = y, we get x ≠ y: ⊃ :zRx. ⊃ .z ε φ.
Then by (1), zRx. ⊃ .z ε φ, z ε R̆"{x}. ⊃ .z ε φ.  Then by (21), φ = R̆"{x},
and so by (20) and (3), RWφ.  Then φ ε Val(W), and we have shown
SC(β) ⊆ Val(W).  Then by (16), we have SC(β) = Val(W), and so by (15)
and (5), (β ∧ α) sm SC(β).
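
The correspondence used in this proof, which sends each function R in β ∧ α to the class of arguments at which R takes the value x, can be checked on a small finite case.  The Python sketch below is an informal illustration only; `function_space`, `preimage`, and `subclasses` are hypothetical helper names, and functions are represented as frozensets of (argument, value) pairs.

```python
from itertools import combinations, product


def function_space(beta, alpha):
    """All functions from beta into alpha, as frozensets of (arg, value) pairs."""
    args = list(beta)
    return {frozenset(zip(args, choice)) for choice in product(list(alpha), repeat=len(args))}


def preimage(func, x):
    """The class of arguments at which func takes the value x."""
    return frozenset(arg for arg, val in func if val == x)


def subclasses(beta):
    """All subclasses of beta, as frozensets (a finite stand-in for SC(beta))."""
    items = list(beta)
    return {frozenset(c) for r in range(len(items) + 1) for c in combinations(items, r)}


beta = {1, 2, 3}
alpha = {"x", "y"}                      # a class with exactly two members
funcs = function_space(beta, alpha)
images = {preimage(f, "x") for f in funcs}

print(images == subclasses(beta))       # every subclass is hit
print(len(images) == len(funcs))        # no two functions give the same subclass
```

The two printed checks correspond to Val(W) = SC(β) and W ε 1-1 for this particular β and α.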

Theorem XI.1.29.  ⊢ (α,β,γ):α ∩ β = Λ. ⊃ .((α ∪ β) ∧ γ) sm ((α ∧ γ) ×
(β ∧ γ)).

Proof.  Let

(1)  W = (Rel ∩ R̂(Arg(R) = α ∪ β))↿(λR(⟨α↿R,β↿R⟩)).

Then

(2)  ⊢ W ε Funct,

and ⊢ Arg(W) = Rel ∩ R̂(Arg(R) = α ∪ β).  So

(3)  ⊢ ((α ∪ β) ∧ γ) ⊆ Arg(W).

Let RWT and SWT.  Then R, S ε Rel and Arg(R) = Arg(S) = α ∪ β
and ⟨α↿R,β↿R⟩ = T = ⟨α↿S,β↿S⟩.  Then α↿R = α↿S and β↿R = β↿S.  Then
by Thm.X.4.11, Part III, (α ∪ β)↿R = (α↿R) ∪ (β↿R) = (α↿S) ∪ (β↿S) =
(α ∪ β)↿S.  So by Thm.X.4.10, corollary, Part I, R = (α ∪ β)↿R =
(α ∪ β)↿S = S.  We infer ⊢ W̆ ε Funct, and so by (2)

(4)  ⊢ W ε 1-1.

Assume α ∩ β = Λ and S ε ((α ∧ γ) × (β ∧ γ)).  Then by rule C, we
get S = ⟨T,U⟩, T ε (α ∧ γ), and U ε (β ∧ γ).  Then T, U ε Funct,
Arg(T) = α, Arg(U) = β, Val(T) ⊆ γ, Val(U) ⊆ γ.  Put R = T ∪ U.
Then by Thm.X.5.21,

(5)  R ε Funct

and by Ex.X.4.6, Parts (a) and (b), with Thm.IX.4.18, Part II,

(6)  Arg(R) = α ∪ β,

Val(R) ⊆ γ.

By these and (5),

(7)  R ε ((α ∪ β) ∧ γ).

By the definition of α↿R, we have α↿R = (α × V) ∩ R = (α × V) ∩
(T ∪ U) = ((α × V) ∩ T) ∪ ((α × V) ∩ U) = (α↿T) ∪ (α↿U).  However,
Arg(α↿U) = α ∩ Arg(U) = α ∩ β = Λ.  So α↿U = Λ.  Also, by
Thm.X.4.10, corollary, Part I, α↿T = T.  So

(8)  α↿R = T.

In a similar manner,

(9)  β↿R = U.

So by (8) and (9), S = ⟨α↿R,β↿R⟩.  Then by (5) and (6), RWS.  From
this and (7), S ε W"((α ∪ β) ∧ γ).  So

(10)  ⊢ α ∩ β = Λ. ⊃ .((α ∧ γ) × (β ∧ γ)) ⊆ W"((α ∪ β) ∧ γ).

Conversely, let S ε W"((α ∪ β) ∧ γ).  Then RWS, R ε ((α ∪ β) ∧ γ).
So Arg(R) = α ∪ β, S = ⟨α↿R,β↿R⟩, R ε Funct, Val(R) ⊆ γ.  Then by
Thm.X.5.7, corollary, α↿R ε Funct and β↿R ε Funct.  Also by Thm.X.4.9,
Part I, and Thm.IX.4.13, Cor. 7, Arg(α↿R) = α and Arg(β↿R) = β.  Also,
by Thm.X.4.6, Part I, and Thm.X.4.2, Part II, Val(α↿R) ⊆ γ and
Val(β↿R) ⊆ γ.  So α↿R ε (α ∧ γ), β↿R ε (β ∧ γ).  Then S ε ((α ∧ γ) ×
(β ∧ γ)).  So

(11)  ⊢ W"((α ∪ β) ∧ γ) ⊆ ((α ∧ γ) × (β ∧ γ)).

By (4), (3), (10), (11), and Thm.XI.1.2, our theorem follows.
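
The splitting used in this proof can be pictured with finite functions: a function on α ∪ β is sent to the pair of its restrictions to α and to β, and when α and β are disjoint the union of the two pieces gives the function back.  The Python sketch below is an informal illustration; `restrict`, `split`, and `merge` are hypothetical helper names, and functions are frozensets of (argument, value) pairs.

```python
def restrict(alpha, func):
    """Keep only the pairs of func whose argument lies in alpha."""
    return frozenset((a, v) for a, v in func if a in alpha)


def split(alpha, beta, func):
    """Send func to the ordered pair of its restrictions to alpha and to beta."""
    return (restrict(alpha, func), restrict(beta, func))


def merge(pair):
    """Rebuild a relation from a pair (T, U) by taking the union of T and U."""
    return pair[0] | pair[1]


alpha, beta = {1, 2}, {3}                       # disjoint classes
R = frozenset({(1, "p"), (2, "q"), (3, "p")})   # a function on their union

T, U = split(alpha, beta, R)
print(T == frozenset({(1, "p"), (2, "q")}))
print(U == frozenset({(3, "p")}))
print(merge((T, U)) == R)
```

If α and β overlapped, two functions T and U that disagree on the overlap would have a union that is not a function, which is why the hypothesis α ∩ β = Λ is needed.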
Theorem XI.1.30.  ⊢ (α,β,γ,δ):(β ∧ α) sm USC(δ). ⊃ .(γ ∧ δ) sm
((γ × β) ∧ α).

Proof.  Let

(1)  R ε 1-1.Arg(R) = (β ∧ α).Val(R) = USC(δ).

Define

(2)  W = (γ ∧ δ)↿(λS((γ × β)↿(λxy((R̆({S(x)}))(y))))).

Then

(3)  W ε Funct,

(4)  Arg(W) = (γ ∧ δ).

Lemma 1.  If U₁ and U₂ are two terms which contain free occurrences
of x but no free occurrences of y, and if U₁ and U₂ are stratified so that
their types are exactly one higher than the type of x, then

⊢ (x):x ε γ.(γ × β)↿(λxy(U₁(y))) = (γ × β)↿(λxy(U₂(y))).U₁ ε (β ∧ α).U₂ ε (β ∧ α). ⊃ .U₁ = U₂.

Proof.  Assume

(i)  x ε γ,

(ii)  (γ × β)↿(λxy(U₁(y))) = (γ × β)↿(λxy(U₂(y))),

(iii)  U₁ ε (β ∧ α),

(iv)  U₂ ε (β ∧ α).

Then by (iii) and (iv), U₁ ε Funct, U₂ ε Funct, Arg(U₁) = β, and
Arg(U₂) = β.  Hence we use Thm.X.5.16 to prove our lemma, to which end
we assume y ε β.  Then by (i), ⟨x,y⟩ ε (γ × β), and so by Thm.X.5.28, Part I,

⟨x,y⟩((γ × β)↿(λxy(U₁(y))))(U₁(y)).

So by (ii),

⟨x,y⟩((γ × β)↿(λxy(U₂(y))))(U₁(y)).

So by Thm.X.5.28, Part I, U₁(y) = U₂(y).  Then our lemma follows by
Thm.X.5.16.

Now, to prove W̆ ε Funct, we assume S₁WT and S₂WT.  Then S₁ ε Funct,
S₂ ε Funct, Arg(S₁) = γ = Arg(S₂), Val(S₁) ⊆ δ, Val(S₂) ⊆ δ, and

(γ × β)↿(λxy((R̆({S₁(x)}))(y))) = (γ × β)↿(λxy((R̆({S₂(x)}))(y))) = T.

We now wish to prove S₁ = S₂ by Thm.X.5.16, and so assume further x ε γ.
Then by Thm.X.5.3, Cor. 4, S₁(x) ε Val(S₁) and S₂(x) ε Val(S₂), and hence
S₁(x) ε δ and S₂(x) ε δ.  Then by Thm.IX.6.10, Part II, {S₁(x)} ε USC(δ)
and {S₂(x)} ε USC(δ).  Then by (1), {S₁(x)} ε Val(R) and {S₂(x)} ε Val(R).
So by Thm.X.5.3, Cor. 5, R̆({S₁(x)}) ε Arg(R) and R̆({S₂(x)}) ε Arg(R).
Then by (1), R̆({S₁(x)}) ε (β ∧ α) and R̆({S₂(x)}) ε (β ∧ α).  Then by
Lemma 1, R̆({S₁(x)}) = R̆({S₂(x)}).  So, as (R̆({S₁(x)}))R{S₁(x)} and
(R̆({S₂(x)}))R{S₂(x)} by Thm.X.5.3, Cor. 3, we get {S₁(x)} = {S₂(x)},
whence S₁(x) = S₂(x).  So by Thm.X.5.16, we conclude S₁ = S₂ from the
assumptions S₁WT and S₂WT.  So W̆ ε Funct, and by (3) we get

(5)  W ε 1-1.

Assume

(6)  T ε ((γ × β) ∧ α)

and define

(7)  S = x̂û(x ε γ.{u} = R(ŷẑ(⟨x,y,z⟩ ε T))).

Lemma 2.  (U,x):x ε γ.U = ŷẑ(⟨x,y,z⟩ ε T). ⊃ .U ε (β ∧ α).

Proof.  Assume

(i)  x ε γ,

(ii)  U = ŷẑ(⟨x,y,z⟩ ε T).

Then by (6),

(iii)  T ε Funct,

(iv)  Arg(T) = γ × β,

(v)  Val(T) ⊆ α.

If we assume yUz, then ⟨x,y,z⟩ ε T, so that by (iii) and Thm.X.5.3,
z = T(⟨x,y⟩).  So

(vi)  U ε Funct.

If we assume yUz, then ⟨x,y⟩Tz, so that z ε α by (v).  Hence

(vii)  Val(U) ⊆ α.

If we assume yUz, then ⟨x,y⟩Tz, so that ⟨x,y⟩ ε γ × β by (iv), so that
y ε β.  Hence

(viii)  Arg(U) ⊆ β.

If we assume y ε β, then ⟨x,y⟩ ε γ × β by (i), so that by (iv) and rule C,
⟨x,y⟩Tz, and so yUz.  Hence

(ix)  β ⊆ Arg(U).

Our lemma follows by (vi), (vii), (viii), and (ix).

Lemma 3.  S ε (γ ∧ δ).

Proof.  Let xSu and xSv.  Then

{u} = R(ŷẑ(⟨x,y,z⟩ ε T)) = {v}.

So u = v.  Hence

(i)  S ε Funct.

Let x ε γ.  Then define

U = ŷẑ(⟨x,y,z⟩ ε T).

Then by Lemma 2, U ε (β ∧ α).  Then by (1) and Thm.X.5.3, Cor. 4,
R(U) ε USC(δ).  So by rule C and Thm.IX.6.10, Part III, {u} = R(U)
and u ε δ.  So by (7), xSu.  So x ε Arg(S).  Hence

(ii)  γ ⊆ Arg(S).

Clearly, xSu. ⊃ .x ε γ, so that

(iii)  Arg(S) ⊆ γ.

If u ε Val(S), then by rule C, xSu, so that x ε γ and {u} = R(ŷẑ(⟨x,y,z⟩ ε
T)).  Define

U = ŷẑ(⟨x,y,z⟩ ε T).

Then {u} = R(U).  By Lemma 2, U ε (β ∧ α), so that by (1) and
Thm.X.5.3, Cor. 4, R(U) ε USC(δ).  Hence {u} ε USC(δ), so that by
Thm.IX.6.10, Part II, u ε δ.  Hence

(iv)  Val(S) ⊆ δ.

Then our lemma follows by (i), (ii), (iii), and (iv).

Lemma 4.  (U,x):x ε γ.U = ŷẑ(⟨x,y,z⟩ ε T). ⊃ .U = R̆({S(x)}).

Proof.  Assume

(i)  x ε γ,

(ii)  U = ŷẑ(⟨x,y,z⟩ ε T).

Then by Lemma 2, U ε (β ∧ α), so that by (1), R(U) ε USC(δ).  Then by
rule C and Thm.IX.6.10, Part III, v ε δ and {v} = R(U).  Then by (i) and
(7), xSv.  So since S ε Funct by Lemma 3, we get v = S(x) by Thm.X.5.3.
So

(iii)  R(U) = {S(x)}.

Since U ε (β ∧ α), we have by (1), (iii), and Thm.X.5.3,

UR{S(x)}.

Then by Thm.X.5.3, Cor. 2, we get

U = R̆({S(x)}).

Lemma 5.  T = (γ × β)↿(λxy((R̆({S(x)}))(y))).

Proof.  First assume uTz.  Then, since Arg(T) = γ × β by (6), we get
u ε γ × β.  So by rule C, u = ⟨x,y⟩, x ε γ, y ε β.  Define

U = ŷẑ(⟨x,y,z⟩ ε T).

Then yUz, and so since U ε Funct by Lemma 2, we get z = U(y) by
Thm.X.5.3.  So by Lemma 4, z = (R̆({S(x)}))(y).  Then by Thm.X.5.28, Part I,
⟨x,y⟩(λxy((R̆({S(x)}))(y)))z.  As u = ⟨x,y⟩ and u ε γ × β, we have

u((γ × β)↿(λxy((R̆({S(x)}))(y))))z.

So

(i)  T ⊆ (γ × β)↿(λxy((R̆({S(x)}))(y))).

Conversely, assume

u((γ × β)↿(λxy((R̆({S(x)}))(y))))z.

Then u ε γ × β and u(λxy((R̆({S(x)}))(y)))z.  So by rule C, u = ⟨x,y⟩,
x ε γ, y ε β.  So by Thm.X.5.28, Part I, z = (R̆({S(x)}))(y).  If we define

U = ŷẑ(⟨x,y,z⟩ ε T),

then by Lemma 4, z = U(y).  However, by Lemma 2, Arg(U) = β, so that
y ε Arg(U).  Hence by Thm.X.5.3, yUz.  So ⟨x,y,z⟩ ε T.  That is, uTz.  So

(ii)  (γ × β)↿(λxy((R̆({S(x)}))(y))) ⊆ T.

Our lemma now follows by (i) and (ii).

By Lemma 3,

S ε (γ ∧ δ),

and by Lemma 5,

T = (γ × β)↿(λxy((R̆({S(x)}))(y))).

Then by (2), SWT, so that T ε Val(W).  Hence

(8)  ((γ × β) ∧ α) ⊆ Val(W).

Conversely, assume T ε Val(W).  Then SWT.  So

(9)  S ε (γ ∧ δ),

(10)  T = (γ × β)↿(λxy((R̆({S(x)}))(y))).

Then by Thm.X.5.28, Part III,

(11)  T ε Funct.

Also by Thm.X.5.28, Part V,

(12)  Arg(T) = γ × β.

Further assume z ε Val(T).  Then by rule C, uTz, and so u ε γ × β and
u(λxy((R̆({S(x)}))(y)))z.  So by rule C, u = ⟨x,y⟩, x ε γ, y ε β.  So
z = (R̆({S(x)}))(y).  By x ε γ and (9), we have x ε Arg(S).  So by (9) and
Thm.X.5.3, Cor. 4, S(x) ε Val(S).  So by (9), S(x) ε δ.  So {S(x)} ε USC(δ).
Then by (1) and Thm.X.5.3, Cor. 5, R̆({S(x)}) ε Arg(R), and by (1),
R̆({S(x)}) ε (β ∧ α).  Then, since y ε β, we have y ε Arg(R̆({S(x)})).  Then
(R̆({S(x)}))(y) ε Val(R̆({S(x)})).  So (R̆({S(x)}))(y) ε α.  So z ε α.  Then

(13)  Val(T) ⊆ α.

So by (11), (12), and (13), T ε ((γ × β) ∧ α).  Hence

(14)  Val(W) ⊆ ((γ × β) ∧ α).

Our theorem now follows by (5), (4), (8), and (14).
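
Setting aside the USC bookkeeping that stratification forces, the correspondence W of this proof turns a function S on γ, whose values stand for functions from β into α, into a single function T on γ × β.  In programming this passage between the two forms is usually called currying.  The Python sketch below is an informal finite illustration only: it ignores the coding relation R and USC altogether, the names `uncurry` and `curry` are hypothetical, and functions are represented as dicts.

```python
def uncurry(S):
    """Send S (x -> function on beta) to T with T[(x, y)] = S[x][y]."""
    return {(x, y): z for x, inner in S.items() for y, z in inner.items()}


def curry(T):
    """Recover S from T by grouping the pairs (x, y) on x (beta assumed nonempty)."""
    S = {}
    for (x, y), z in T.items():
        S.setdefault(x, {})[y] = z
    return S


# gamma = {"g1", "g2"}, beta = {1, 2}, alpha = {"a", "b"}
S = {"g1": {1: "a", 2: "b"}, "g2": {1: "b", 2: "b"}}
T = uncurry(S)

print(T[("g1", 2)] == "b")
print(curry(T) == S)          # the passage from S to T loses no information
```

The function `curry` plays the part of the S constructed from a given T in (7), and the round trip corresponds to Lemma 5.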

Theorem XI.1.31.  ⊢ (α,β,γ):α ⊆ β. ⊃ .(γ ∧ α) ⊆ (γ ∧ β).

Proof.  Let α ⊆ β and let R ε (γ ∧ α).  Then R ε Funct, Arg(R) = γ,
Val(R) ⊆ α.  Then Val(R) ⊆ β.

Theorem XI.1.32.  ⊢ (α,β):α ≠ Λ. ⊃ .(β ∧ α) ≠ Λ.

Proof.  Assume α ≠ Λ.  Then x ε α, so that {x} ⊆ α.  Now clearly
u(β × {x})v. ⊃ .v = x, so that

(1)  β × {x} ε Funct.

Also, {x} ≠ Λ, so that by Thm.X.4.4, Part I,

(2)  Arg(β × {x}) = β.

If β = Λ, then by Thm.X.2.12, Cor. 3, β × {x} = Λ, so that by
Thm.X.4.3, Part II, Val(β × {x}) = Λ, so that by Thm.IX.4.13, Cor. 5,
Val(β × {x}) ⊆ α.  If β ≠ Λ, then by Thm.X.4.4, Part II, Val(β × {x}) =
{x}, so that in this case also Val(β × {x}) ⊆ α.  So

(3)  Val(β × {x}) ⊆ α.

Then β × {x} ε (β ∧ α).
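
The witness β × {x} used in this proof is simply the constant function on β with value x.  The Python sketch below illustrates this on finite classes; it is informal, and the names `constant_function` and `is_function` are hypothetical helpers, with relations represented as sets of pairs.

```python
def constant_function(beta, x):
    """The relation beta x {x}: every member of beta is paired with x."""
    return {(u, x) for u in beta}


def is_function(rel):
    """True when no argument is paired with two different values."""
    return len({u for u, _ in rel}) == len(rel)


alpha = {"p", "q"}           # nonempty, so some x can be chosen from it
beta = {1, 2, 3}
F = constant_function(beta, "p")

print(is_function(F))
print({u for u, _ in F} == beta)        # argument range is beta
print({v for _, v in F} <= alpha)       # values lie in alpha
```

When β is empty the same construction yields the empty relation, which is still a function with argument range β, matching the case β = Λ in the proof.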

We now close the section with some miscellaneous results.

*Theorem XI.1.33.  ⊢ (α,β):α sm β. ≡ .USC(α) sm USC(β).

Proof.  By Thm.X.5.19, Cor. 9, and Thm.X.4.29, we readily verify that

⊢ (α,β,R,S):α sm_R β.S = RUSC(R). ⊃ .USC(α) sm_S USC(β).

So

(1)  ⊢ (α,β):α sm β. ⊃ .USC(α) sm USC(β).

Conversely, assume USC(α) sm USC(β).  Then by rule C, S ε 1-1,
USC(α) = Arg(S), USC(β) = Val(S).  Put R = x̂ŷ({x}S{y}).  Then one
can prove α sm_R β without difficulty.

Corollary.  ⊢ (α).Can(α) ≡ Can(USC(α)).

Proof.  Put USC(α) for β in the theorem.

Theorem XI.1.34.  ⊢ (α,β):Can(α).α sm β. ⊃ .Can(β).

Proof.  Assume Can(α) and α sm β.  Then α sm USC(α) and USC(α) sm
USC(β).  So β sm α, α sm USC(α), USC(α) sm USC(β), and hence
β sm USC(β).

Theorem XI.1.35.  ⊢ (α).USC(SC(α)) sm SC(USC(α)).

Proof.  Take

W = x̂ŷ(Ez).x = {z}.y = USC(z).

One proves easily

(1)  ⊢ W ε 1-1,

(2)  ⊢ USC(SC(α)) ⊆ Arg(W).

Let y ε SC(USC(α)).  Then y ⊆ USC(α).  So by Thm.IX.6.14, Cor. 2,
and rule C, z ⊆ α.y = USC(z).  Hence z ε SC(α), {z}Wy.  Hence {z} ε
USC(SC(α)).{z}Wy.  Thus y ε W"USC(SC(α)).  So we have shown

(3)  ⊢ SC(USC(α)) ⊆ W"USC(SC(α)).

Conversely, let y ε W"USC(SC(α)).  Then by rule C, xWy.
x ε USC(SC(α)).  So by rule C, x = {z}, y = USC(z), and x = {w},
w ε SC(α).  So z = w, and y = USC(z).z ⊆ α.  Hence by Thm.IX.6.12,
Part V, y ⊆ USC(α).  So y ε SC(USC(α)).  So

⊢ W"USC(SC(α)) ⊆ SC(USC(α)).

Then by (1), (2), and (3), our theorem follows by Thm.XI.1.2.

Theorem XI.1.36.  ⊢ (α,β):α sm β. ⊃ .SC(α) sm SC(β).

Proof.  Let A = {Λ,V}.  Then by Thm.IX.4.6, corollary, and
Thm.X.1.16, Cor. 1, ⊢ A ε 2.  Then by Thm.XI.1.28, ⊢ SC(α) sm (α ∧ A),
⊢ SC(β) sm (β ∧ A).  However, by Thm.XI.1.22, ⊢ α sm β. ⊃ .(α ∧ A) sm
(β ∧ A).

Theorem XI.1.37.  ⊢ (α):Can(α). ⊃ .Can(SC(α)).

Proof.  Assume Can(α).  That is, α sm USC(α).  Then by Thm.XI.1.36,
SC(α) sm SC(USC(α)).  Then by Thm.XI.1.35, SC(α) sm USC(SC(α)).
That is, Can(SC(α)).

Theorem XI.1.38.  ⊢ (α,x):({x} ∧ α) sm USC(α).

Proof.  By Thm.XI.1.17 and Thm.XI.1.20, ⊢ ({x} × α) sm α.  So by
Thm.XI.1.33, ⊢ USC({x} × α) sm USC(α).  Then our theorem follows by
Thm.XI.1.26.

Theorem XI.1.39.  ⊢ (α,β):Can(α).Can(β).α ∩ β = Λ. ⊃ .Can(α ∪ β).

Proof.  Assume Can(α), Can(β), α ∩ β = Λ.  Then by Thm.IX.6.12,
Part I, USC(α) ∩ USC(β) = USC(Λ).  Then by Thm.IX.6.12, Part IV,
USC(α) ∩ USC(β) = Λ.  Also we have α sm USC(α) and β sm USC(β).
So by Thm.XI.1.12, corollary, (α ∪ β) sm (USC(α) ∪ USC(β)).  Then by
Thm.IX.6.12, Part II, (α ∪ β) sm USC(α ∪ β).  That is, Can(α ∪ β).

Theorem XI.1.40.  ⊢ (R):R ε Rel. ⊃ .RUSC(R) sm USC(R).

Proof.  Define

W = ûv̂(Ex,y).u = ⟨{x},{y}⟩.v = {⟨x,y⟩}.

Clearly

(1)  ⊢ W ε 1-1.

Let R ε Rel and u ε RUSC(R).  By Thm.X.3.14, RUSC(R) ε Rel, and so
by Thm.X.3.1, u = ⟨w,z⟩, w(RUSC(R))z.  Then w = {x}, z = {y}, xRy.
So u = ⟨{x},{y}⟩.  So uW{⟨x,y⟩}.  So u ε Arg(W).  Hence

(2)  ⊢ R ε Rel. ⊃ .RUSC(R) ⊆ Arg(W).

Let R ε Rel and v ε USC(R).  Then z ε R and v = {z}.  Then z = ⟨x,y⟩,
xRy.  So v = {⟨x,y⟩} and {x}(RUSC(R)){y}.  Then ⟨{x},{y}⟩ ε RUSC(R)
and ⟨{x},{y}⟩Wv.  Thus v ε W"RUSC(R).  So

(3)  ⊢ R ε Rel. ⊃ .USC(R) ⊆ W"RUSC(R).

Let v ε W"RUSC(R).  Then uWv and u ε RUSC(R).  So u = ⟨{x},{y}⟩
and v = {⟨x,y⟩}.  So {x}(RUSC(R)){y}.  Then xRy, and so ⟨x,y⟩ ε R and
so {⟨x,y⟩} ε USC(R).  Thus v ε USC(R).  So

(4)  ⊢ W"RUSC(R) ⊆ USC(R).

Theorem XI.1.41.  ⊢ (α,β).USC(α × β) sm (USC(α) × USC(β)).

Proof.  Use Thm.X.3.19 and Thm.XI.1.40.

Theorem XI.1.42.  ⊢ (α,β):Can(α).Can(β). ⊃ .Can(α × β).

Proof.  Let Can(α) and Can(β).  Then α sm USC(α) and β sm USC(β).
So by Thm.XI.1.18, Part II, (α × β) sm (USC(α) × USC(β)).  Then by
Thm.XI.1.41, (α × β) sm (USC(α × β)).  That is, Can(α × β).

Theorem XI.1.43.  ⊢ (α,β,R):R ∈ (α ↑ β). ≡ .RUSC(R) ∈ (USC(α) ↑
USC(β)).

Proof.  Let R ∈ Rel.  If now R ∈ (α ↑ β), then R ∈ Funct, Arg(R) = α,
and Val(R) ⊆ β.  By Thm.X.5.17, RUSC(R) ∈ Funct.  By Thm.X.4.29,
Part I, Arg(RUSC(R)) = USC(α).  By Thm.X.4.29, Part II, and Thm.
IX.6.12, Part V, Val(RUSC(R)) ⊆ USC(β).  So RUSC(R) ∈ (USC(α) ↑
USC(β)).  Conversely, if RUSC(R) ∈ (USC(α) ↑ USC(β)), then by the
same theorems, R ∈ (α ↑ β).

Theorem XI.1.44.  ⊢ (α,β).USC(α ↑ β) sm (USC(α) ↑ USC(β)).

Proof.  Define

W = ûv̂(ER).R ∈ Rel.u = {R}.v = RUSC(R).
By Thm.IX.6.3 and Thm.X.3.17,

(1)  ⊢ W ∈ 1-1.

Let u ∈ USC(α ↑ β).  Then u = {R} and R ∈ (α ↑ β).  So uW(RUSC(R)).
Hence u ∈ Arg(W).  So

(2)  ⊢ USC(α ↑ β) ⊆ Arg(W).

Let S ∈ (USC(α) ↑ USC(β)).  So S ∈ Funct.Arg(S) = USC(α).Val(S) ⊆
USC(β).  Define

R = x̂ŷ({x}S{y}).
Then

(3)  (Eu,v).uRv.x = {u}.y = {v}: ⊃ :xSy.

If xSy, then x ∈ Arg(S), y ∈ Val(S).  So x ∈ USC(α), y ∈ USC(β).  So
x = {u}, u ∈ α, and y = {v}, v ∈ β.  So {u}S{v}.  Then uRv.  So

(4)  xSy: ⊃ :(Eu,v).uRv.x = {u}.y = {v}.

Then by (3), (4), and Thm.X.3.15, (x,y):xSy. ≡ .x(RUSC(R))y.  So
S = RUSC(R).  Thus RUSC(R) ∈ (USC(α) ↑ USC(β)).  Then by Thm.
XI.1.43, R ∈ (α ↑ β).  So {R} ∈ USC(α ↑ β).  Also, since S = RUSC(R),
we have {R}WS.  So S ∈ W"USC(α ↑ β).  Thus

(5)  ⊢ (USC(α) ↑ USC(β)) ⊆ W"USC(α ↑ β).

Conversely, let S ∈ W"USC(α ↑ β).  Then uWS and u ∈ USC(α ↑ β).
So R ∈ Rel, u = {R}, S = RUSC(R).  Then {R} ∈ USC(α ↑ β).  So R ∈
(α ↑ β), and by Thm.XI.1.43, RUSC(R) ∈ (USC(α) ↑ USC(β)).  That is,
S ∈ (USC(α) ↑ USC(β)).  Hence

(6)  ⊢ W"USC(α ↑ β) ⊆ (USC(α) ↑ USC(β)).

Our  theorem  now  follows  by  (1),  (2),  (5),  and  (6). 

Theorem XI.1.45.  ⊢ (α,β):Can(α).Can(β). ⊃ .Can(α ↑ β).

Proof.  Assume Can(α) and Can(β).  Then α sm USC(α), β sm USC(β).
So by Thm.XI.1.23, corollary, (α ↑ β) sm (USC(α) ↑ USC(β)).  Then by
Thm.XI.1.44, (α ↑ β) sm USC(α ↑ β).  That is, Can(α ↑ β).

EXERCISES 

XI.1.1.  Prove ⊢ (α,β):α,β ∈ 2. ⊃ .α sm β.

XI.1.2.  Prove ⊢ (α,β,R,S):α sm_R β.S = SC(α)↿(λx(R"x)). ⊃ .SC(α)
sm_S SC(β).

XI.1.3.  Prove ⊢ (α,β):α ∩ β = Λ. ⊃ .SC(α ∪ β) sm (SC(α) × SC(β)).
(Hint.  Take γ to be {Λ,V} in Thm.XI.1.29 and use Thm.XI.1.28.)

XI.1.4.  Prove ⊢ (α,β,R):α ∩ β = Λ.R = SC(α ∪ β)↿λx(⟨α ∩ x,β ∩ x⟩).
⊃ .SC(α ∪ β) sm_R (SC(α) × SC(β)).

XI.1.5.  Prove ⊢ ∼Can(1).

XI.1.6.     Give  the  details  of  the  proof  of  Thm.XI.1.23. 

2.  Elementary Properties of Cardinal Numbers.  Since α sm β expresses
the fact that α and β have the same number of members, we can define the
number of members of α by abstraction with respect to the equivalence
relation sm.  Accordingly, we now define the cardinal number Nc(α) of a
class α by abstraction with respect to sm, namely,

Nc(α)         for         sm"{α}.

Thus Nc(α) is the equivalence class of α relative to sm.  We define the
set of cardinal numbers as the set of equivalence classes with respect to sm,
namely,

NC         for         EqC(sm).

If α is a variable, then the indicated occurrence of α is the only free
occurrence of a variable in Nc(α).  Also Nc(A) is stratified if and only if
A is stratified, and if stratified is one type higher than A.  The term NC
is stratified and has no free variables, and so may be assigned any type.
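
As a finite illustration only, the following Python sketch partitions a small family of finite sets into equivalence classes under equinumerosity, which is the finite analogue of forming Nc(α) as the class of everything similar to α. The names `similar` and `cardinal_classes` and the sample family are assumptions introduced here, and comparing lengths merely stands in for exhibiting a one-to-one correspondence; nothing in it models stratification or the class of all classes.

```python
def similar(a: frozenset, b: frozenset) -> bool:
    """Finite stand-in for 'a sm b': equal sizes means a bijection exists."""
    return len(a) == len(b)


def cardinal_classes(family):
    """Partition a family of finite sets into classes of mutually similar sets."""
    classes = []
    for s in family:
        for cls in classes:
            # Compare against any representative already in the class.
            if similar(s, next(iter(cls))):
                cls.add(s)
                break
        else:
            classes.append({s})
    return [frozenset(c) for c in classes]


# Hypothetical sample family (not from the text).
family = [frozenset(x) for x in ([], [1], [2], [1, 2], [3, 4])]
classes = cardinal_classes(family)
```

Each resulting class plays the role of one cardinal number restricted to the sample family: two sets lie in the same class exactly when they are similar.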

By  Thm.X.4.22,  corollary,  we  infer  the  following  theorem. 
*Theorem XI.2.1.  ⊢ (α,β):β ∈ Nc(α). ≡ .α sm β.
*Corollary 1.  ⊢ (α).α ∈ Nc(α).

Corollary 2.  ⊢ (α,β):α ∈ Nc(β). ≡ .β ∈ Nc(α).

Corollary 3.  ⊢ (α,β,γ):α,β ∈ Nc(γ). ⊃ .α sm β.

Corollary 4.  ⊢ (α,β,γ):α ∈ Nc(γ).α sm β. ⊃ .β ∈ Nc(γ).

By  combining  Thm.XI.1.3,  Cor.  1,  and  Thm.XI.1.5,  Cor.  3,  with  the 
theorems  of  Sec.  7  of  Chapter  X,  we  infer  the  following  theorem. 

Theorem XI.2.2.

I.  ⊢ (n):n ∈ NC. ≡ .(Eα).n = Nc(α).
II.  ⊢ (α,β):α sm β. ≡ .Nc(α) = Nc(β).

III.  ⊢ (n):.n ∈ NC: ⊃ :(α):α ∈ n. ≡ .n = Nc(α).

IV.  ⊢ (m,n):m,n ∈ NC.m ∩ n ≠ Λ. ⊃ .m = n.
V.  ⊢ (α)(E₁n).α ∈ n.n ∈ NC.

VI.  ⊢ (n):n ∈ NC. ⊃ .n ≠ Λ.

Corollary 1.  ⊢ (α).Nc(α) ∈ NC.

Corollary 2.  ⊢ (α).Nc(α) ≠ Λ.

Corollary 3.  ⊢ (α,β):Nc(α) ∩ Nc(β) ≠ Λ. ≡ .Nc(α) = Nc(β).

Corollary 4.  ⊢ (α,β):Nc(α) ∩ Nc(β) ≠ Λ. ≡ .α sm β.

Theorem  XI.2.3. 
*I.  ⊢ (α,β,n):n ∈ NC.α,β ∈ n. ⊃ .α sm β.
*II.  ⊢ (α,β,n):n ∈ NC.α ∈ n.α sm β. ⊃ .β ∈ n.

Proof of I.  Assume n ∈ NC and α,β ∈ n.  Then by Thm.XI.2.2, Part I,
and rule C, n = Nc(γ).  Then our result follows by Thm.XI.2.1, Cor. 3.

Proof  of  II.     Similar. 

Theorem XI.2.4.  ⊢ (α).Nc(USC(α)) ≠ Nc(SC(α)).

Proof.  Use Thm.XI.1.6 and Thm.XI.2.2, Part II.

Theorem XI.2.5.  ⊢ (α):Can(α). ≡ .Nc(α) = Nc(USC(α)).

Proof.  Use the definition of Can(α) with Thm.XI.2.2, Part II.

Theorem XI.2.6.  ⊢ (α):Can(α). ⊃ .Nc(α) ≠ Nc(SC(α)).

Proof.     Use  Thm.XI.1.7. 

Theorem XI.2.7.  ⊢ 0 = Nc(Λ).

Proof.  By Thm.XI.2.1, ⊢ β ∈ Nc(Λ). ≡ .β sm Λ.  So by Thm.XI.1.9,
corollary, ⊢ β ∈ Nc(Λ). ≡ .β ∈ 0.
Corollary 1.  ⊢ 0 ∈ NC.

Corollary 2.  ⊢ (α):Nc(α) = 0. ≡ .α = Λ.

Corollary 3.  ⊢ (α):Λ ∈ Nc(α). ≡ .α = Λ.

Corollary 4.  ⊢ (n):.n ∈ NC: ⊃ :Λ ∈ n. ≡ .n = 0.

Note.  In Cors. 2 and 3, we use Thm.XI.1.9.

Theorem XI.2.8.  ⊢ (x).1 = Nc({x}).

Proof.  Use Thm.XI.2.1 and Thm.XI.1.11.

Corollary 1.  ⊢ 1 = Nc(0).

Corollary 2.  ⊢ 1 ∈ NC.

Corollary 3.  ⊢ (α,x):{x} ∈ Nc(α). ≡ .α ∈ 1.

Corollary 4.  ⊢ (n,x):.n ∈ NC: ⊃ :{x} ∈ n. ≡ .n = 1.

Theorem XI.2.9.  ⊢ (α,β):α ∩ β = Λ. ⊃ .Nc(α) + Nc(β) = Nc(α ∪ β).

Proof.  Let α ∩ β = Λ.  If γ ∈ Nc(α) + Nc(β), then by Thm.X.1.1,
φ ∈ Nc(α).θ ∈ Nc(β).φ ∩ θ = Λ.φ ∪ θ = γ.  Then by Thm.XI.2.1, α sm φ
and β sm θ.  So by Thm.XI.1.12, corollary, (φ ∪ θ) sm (α ∪ β).  That is,
γ sm (α ∪ β).  So γ ∈ Nc(α ∪ β) by Thm.XI.2.1.  So

(1)  Nc(α) + Nc(β) ⊆ Nc(α ∪ β).
Conversely, if γ ∈ Nc(α ∪ β), then (α ∪ β) sm γ.  Then

(2)  R ∈ 1-1,

α ∪ β = Arg(R) and γ = Val(R).  So by Thm.IX.4.13, Cor. 7, α ⊆ Arg(R)
and β ⊆ Arg(R).  Hence by (2) and Thm.XI.1.2, α sm (R"α) and
β sm (R"β).
Thus

(3)  (R"α) ∈ Nc(α)

(4)  (R"β) ∈ Nc(β).

Now suppose (R"α) ∩ (R"β) ≠ Λ.  Then y ∈ R"α and y ∈ R"β.  Then
xRy.x ∈ α and zRy.z ∈ β.  So by (2), x = z, and so x ∈ β.  Hence x ∈ α ∩ β,
contrary to assumption.  Thus

(5)  (R"α) ∩ (R"β) = Λ.

By Ex.X.4.6, (R"α) ∪ (R"β) = R"(α ∪ β).  Then by Thm.X.4.26,
Part I,

(6)  γ = (R"α) ∪ (R"β).

Then by (3), (4), (5), and (6), γ ∈ Nc(α) + Nc(β).  So by (1), Nc(α) +
Nc(β) = Nc(α ∪ β).

We now wish to prove that, if m and n are cardinal numbers, then so is
m + n.  The critical point is that, if α is a class with m members and β is a
class with n members, then α and β may overlap, and so α ∪ β may have
fewer than m + n members.  Thus there seems no immediate way to get a
class with m + n members, and we must produce such a class to prove that
m + n is a cardinal number.  This difficulty is avoided by noting that
α × 0 has as many members as α, namely, m, and β × {V} has as many
members as β, namely, n, and α × 0 and β × {V} do not overlap, so that
(α × 0) ∪ (β × {V}) has m + n members.
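
The tagging device just described can be tried on small finite sets. The Python sketch below is only an illustration under stated assumptions: the function name `tagged_union` and the tag strings are hypothetical, with the two distinct tags standing in for Λ and V, so that the pairs play the role of the members of α × 0 and β × {V}.

```python
def tagged_union(alpha, beta, tag_a="Lambda", tag_b="V"):
    """Build disjoint copies of alpha and beta by pairing members with distinct tags."""
    if tag_a == tag_b:
        raise ValueError("the two tags must differ, as Lambda and V do")
    left = {(x, tag_a) for x in alpha}    # finite stand-in for alpha x 0
    right = {(x, tag_b) for x in beta}    # finite stand-in for beta x {V}
    return left, right, left | right


# Hypothetical overlapping sample classes (not from the text).
alpha = {1, 2, 3}
beta = {3, 4}
left, right, union = tagged_union(alpha, beta)
```

Here the plain union of the samples loses a member to the overlap, while the tagged copies are disjoint and their union has as many members as the two samples taken together.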
*Theorem XI.2.10.  ⊢ (m,n):m,n ∈ NC. ⊃ .m + n ∈ NC.
Proof.  Let m,n ∈ NC.  Then

(1)  m = Nc(α),

(2)  n = Nc(β).

Now, by Thm.XI.1.20, α sm (α × 0) and β sm (β × {V}).  So by (1),
(2), and Thm.XI.2.2, Part II,

(3)  m = Nc(α × 0),

(4)  n = Nc(β × {V}).

Suppose (α × 0) ∩ (β × {V}) ≠ Λ.  Then u ∈ α × 0 and u ∈ β × {V}.
So u = ⟨x,y⟩.x ∈ α.y ∈ 0 and u = ⟨w,v⟩.w ∈ β.v ∈ {V}.  Hence y = Λ, y = v,
and v = V.  This contradicts Thm.IX.4.6, corollary.  So

(5)  (α × 0) ∩ (β × {V}) = Λ.
So by (3), (4), (5), and Thm.XI.2.9,

m + n = Nc((α × 0) ∪ (β × {V})).

Then by Thm.XI.2.2, Part I, m + n ∈ NC.
Corollary.  ⊢ (n):n ∈ NC. ⊃ .n + 1 ∈ NC.
*Theorem XI.2.11.  ⊢ Nn ⊆ NC.
Proof.  We shall prove

⊢ (n):n ∈ Nn. ⊃ .n ∈ NC

by "induction" on n.  That is, we use Thm.X.1.13, taking F(n) to be
n ∈ NC.  By Thm.XI.2.7, Cor. 1, we have ⊢ F(0).  By Thm.XI.2.10,
corollary,

⊢ (n):n ∈ Nn.F(n). ⊃ .F(n + 1).
Hence

⊢ (n):n ∈ Nn. ⊃ .F(n),

which is our theorem.
Corollary 1.  ⊢ 2 ∈ NC.
Corollary 2.  ⊢ 3 ∈ NC.
etc.
Thus we infer that every nonnegative integer is a cardinal number.
Indeed the nonnegative integers are just the finite cardinal numbers, but
we shall go into this question later.

Theorem XI.2.12.  ⊢ (m,n):m,n ∈ NC.m + 1 = n + 1. ⊃ .m = n.

Proof.  Assume m,n ∈ NC and m + 1 = n + 1.  Then by Thm.XI.2.10,
corollary, m + 1 ∈ NC.  So by rule C, m + 1 = Nc(γ) = n + 1.  So
γ ∈ m + 1 and γ ∈ n + 1.  Then by Thm.X.1.16 and rule C

θ ∈ m.∼x ∈ θ.θ ∪ {x} = γ.
φ ∈ n.∼y ∈ φ.φ ∪ {y} = γ.

So by Thm.XI.1.13, θ sm φ.  Now from θ ∈ m and φ ∈ n, we get m = Nc(θ)
and n = Nc(φ) by Thm.XI.2.2, Part III.  Also, from θ sm φ, we get
Nc(θ) = Nc(φ) by Thm.XI.2.2, Part II.  Thus m = n.

Corollary.  ⊢ (m,n):m,n ∈ Nn.m + 1 = n + 1. ⊃ .m = n.

This  corollary  is  just  the  result  assumed  in  Axiom  scheme  13.  Of  course, 
we  have  used  Axiom  scheme  13  in  the  proof,  but  indirectly  and  only  through 
Thms.X.2.7,  X.2.8,  and  X.2.9.  Hence,  if  these  theorems  can  somehow  be 
proved  without  the  use  of  Axiom  scheme  13,  then  we  can  dispense  with 
Axiom  scheme  13,  since  in  the  corollary  just  proved  we  have  derived  the 
result  stated  in  Axiom  scheme  13. 

We now introduce the relations of greater and less between cardinal
numbers.  We define

≤_c         for         m̂n̂(Eα,β).α ∈ m.β ∈ n.α ⊆ β,
≥_c         for         Cnv(≤_c),
<_c         for         ≤_c − I,
>_c         for         Cnv(<_c).

≤_c, ≥_c, <_c, and >_c are all stratified and contain no free variables and so
may be assigned any type.

In most cases, it will be clear from the context that we are dealing with
cardinal numbers, and in such cases we shall omit the subscript c and write
merely ≤, ≥, <, or >.

We commonly write m ≤ n ≤ p for m ≤ n.n ≤ p, m ≤ n < p for m ≤ n.
n < p, m = n < p for m = n.n < p, etc.

Theorem XI.2.13.
*I.  ⊢ (m,n):m ≤ n. ≡ .(Eα,β).α ∈ m.β ∈ n.α ⊆ β.
II.  ⊢ (m,n):m ≥ n. ≡ .n ≤ m.

III.  ⊢ (m,n):m < n. ≡ .m ≤ n.m ≠ n.

IV.  ⊢ (m,n):m > n. ≡ .n < m.

Theorem XI.2.14.  ⊢ (n):n ≠ Λ. ⊃ .n ≤ n.
Proof.  Use Thm.IX.4.13, Cor. 3.
Corollary 1.  ⊢ (n):n ∈ NC. ⊃ .n ≤ n.
*Corollary 2.  ⊢ (m,n):.m,n ∈ NC: ⊃ :m ≤ n. ≡ .m < n.v.m = n.

Corollary 3.  ⊢ (m,n):.m,n ∈ NC: ⊃ :m ≥ n. ≡ .m > n.v.m = n.

Proof of Corollaries 2 and 3.  Use proof by cases with the cases m = n
and m ≠ n.

Theorem XI.2.15.  ⊢ (n):n ≠ Λ. ⊃ .0 ≤ n.

Proof.  Use Thm.IX.4.13, Cor. 5.

Corollary 1.  ⊢ (n):n ≠ Λ.n ≠ 0. ⊃ .0 < n.

Corollary 2.  ⊢ (n):n ∈ NC. ⊃ .0 ≤ n.

Corollary 3.  ⊢ (n):n ∈ NC.n ≠ 0. ⊃ .0 < n.

Theorem XI.2.16.  ⊢ (n):n ≠ Λ. ⊃ .n ≤ Nc(V).

Proof.  Use Thm.IX.4.13, Cor. 4.

Corollary.  ⊢ (n):n ∈ NC. ⊃ .n ≤ Nc(V).

Thus we see that there is a least cardinal number, namely, 0, and a
greatest cardinal number, namely, Nc(V).  In intuitive mathematics, the
latter result contradicts an alternative form of Cantor's theorem in the
form (α).Nc(α) < Nc(SC(α)).  Because of our stratification requirements,
we only know how to prove this latter result with the hypothesis Can(α)
and thus have no contradiction with the result that Nc(V) is the greatest
cardinal.

Theorem XI.2.17.  ⊢ (α).Nc(USC(α)) < Nc(SC(α)).

Proof.  By Thm.XI.2.1, Cor. 1, ⊢ USC(α) ∈ Nc(USC(α)) and ⊢ SC(α) ∈
Nc(SC(α)).  Also by Thm.IX.6.18, Cor. 1, ⊢ USC(α) ⊆ SC(α).  So
⊢ Nc(USC(α)) ≤ Nc(SC(α)).  Our theorem now follows by Thm.XI.2.4.
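
For finite classes the content of this theorem can be checked directly. The Python sketch below is an illustration only; the helper names `usc` and `sc` are assumptions chosen to mirror USC(α), the class of unit subclasses, and SC(α), the class of all subclasses, and the sample class is hypothetical.

```python
from itertools import combinations


def usc(alpha):
    """Unit subclasses of alpha: one singleton per member."""
    return {frozenset([x]) for x in alpha}


def sc(alpha):
    """All subclasses of alpha, built by choosing r members for every r."""
    items = list(alpha)
    return {
        frozenset(choice)
        for r in range(len(items) + 1)
        for choice in combinations(items, r)
    }


# Hypothetical sample class (not from the text).
sample = {1, 2, 3}
unit_subclasses = usc(sample)
subclasses = sc(sample)
```

Every unit subclass is a subclass, which gives the inclusion used in the proof, and the strict inequality shows up as a difference in size even for the null class, since SC(Λ) still contains Λ itself.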

Theorem XI.2.18.  ⊢ (α):Can(α). ⊃ .Nc(α) < Nc(SC(α)).

Proof.  Combine Thm.XI.2.5 with Thm.XI.2.17.

Theorem XI.2.19.  ⊢ (m,n):m + n ≠ Λ. ⊃ .m ≤ m + n.n ≤ m + n.

Proof.  Let m + n ≠ Λ.  Then by rule C, α ∈ m + n.  So by rule C and
Thm.X.1.1, β ∈ m.γ ∈ n.β ∩ γ = Λ.β ∪ γ = α.  So by Thm.IX.4.13, Cor. 7,
β ∈ m.α ∈ m + n.β ⊆ α.  So m ≤ m + n.  Similarly, n ≤ m + n.

Corollary.  ⊢ (m,n):m,n ∈ NC. ⊃ .m ≤ m + n.n ≤ m + n.

We now prove one of the most important properties of ≤, namely, the
version of the Schröder-Bernstein theorem which is stated in terms of
inequality of cardinal numbers.
**Theorem XI.2.20.  ⊢ (m,n):m,n ∈ NC.m ≤ n.n ≤ m. ⊃ .m = n.

Proof.  Assume the hypothesis.  Then by rule C

(1)  γ ∈ m.β ∈ n.γ ⊆ β,

(2)  δ ∈ n.α ∈ m.δ ⊆ α.

Then by Thm.XI.2.3, Part I, α sm γ and β sm δ.  So by Thm.XI.1.15,
α sm β.  So by Thm.XI.2.3, Part II, β ∈ m.  Hence m ∩ n ≠ Λ.  So by
Thm.XI.2.2, Part IV, m = n.
Corollary 1.  ⊢ (m,n):m,n ∈ NC.m < n. ⊃ .∼(n ≤ m).
Corollary 2.  ⊢ (m,n):m,n ∈ NC.n ≤ m. ⊃ .∼(m < n).
Corollary 3.  ⊢ (m,n):m,n ∈ NC.n > m. ⊃ .∼(n ≤ m).
Corollary 4.  ⊢ (m,n):m,n ∈ NC.m ≥ n. ⊃ .∼(m < n).
Theorem XI.2.21.  ⊢ (m,n):.m,n ∈ NC: ⊃ :m < n. ≡ .m ≤ n.∼(n ≤ m).
Proof.  Assume m,n ∈ NC.  Then by Thm.XI.2.13, Part III, and Thm.
XI.2.20, Cor. 1

m < n. ⊃ .m ≤ n.∼(n ≤ m).

Conversely, since m = n. ⊃ .n ≤ m by Thm.XI.2.14, Cor. 1, we get
∼(n ≤ m). ⊃ .m ≠ n, and so by Thm.XI.2.13, Part III,

m ≤ n.∼(n ≤ m). ⊃ .m < n.

*Theorem XI.2.22.  ⊢ (m,n):.m,n ∈ NC: ⊃ :m ≤ n. ≡ .(Ep).p ∈ NC.n =
m + p.

Proof.  Assume m,n ∈ NC.  Then by Thm.XI.2.19, corollary, we can go
from right to left in our equivalence.  Now let m ≤ n.  By rule C, α ∈ m.
β ∈ n.α ⊆ β.  Hence by Thm.IX.4.13, Part III, β = α ∪ β, so that by
Thm.IX.4.4, Part XIX, β = α ∪ (β − α).  Also α ∩ (β − α) = Λ.  Also
(β − α) ∈ Nc(β − α).  Hence β ∈ m + Nc(β − α) by Thm.X.1.1.  However,
n ∈ NC and (m + Nc(β − α)) ∈ NC, so that by Thm.XI.2.2, Part IV,
n = m + Nc(β − α).
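
The left-to-right step of this proof rests on splitting β into α and β − α. A minimal Python sketch of that decomposition on finite sets follows; the function name `difference_witness` and the sample sets are assumptions made for illustration, and set sizes stand in for the cardinal numbers m, n, and p.

```python
def difference_witness(alpha, beta):
    """Given alpha contained in beta, return beta - alpha, the class whose
    cardinal serves as p in n = m + p."""
    if not alpha <= beta:
        raise ValueError("alpha must be a subclass of beta")
    return beta - alpha


# Hypothetical sample classes (not from the text).
alpha = {1, 2}
beta = {1, 2, 3, 4, 5}
rest = difference_witness(alpha, beta)
```

The three facts used in the proof can be read off the result: α and β − α are disjoint, their union is β, and so the sizes add.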

Theorem XI.2.23.  ⊢ (m,n,p):m,n,p ∈ NC.m ≤ n. ⊃ .m + p ≤ n + p.

Proof.  Assume the hypothesis.  Then by rule C and Thm.XI.2.22,
q ∈ NC.n = m + q.  So n + p = (m + q) + p = m + (q + p) = m +
(p + q) = (m + p) + q.  So by Thm.XI.2.22, m + p ≤ n + p.

Theorem XI.2.24.  ⊢ (n):n ∈ NC. ⊃ .n = 0.v.1 ≤ n.

Proof.  Assume n ∈ NC.  Then by rule C, n = Nc(α).

Case 1.  α = Λ.  Then n = 0.

Case 2.  α ≠ Λ.  Then by rule C, x ∈ α.  So {x} ⊆ α.  As {x} ∈ 1 and
α ∈ n, we have 1 ≤ n.

Corollary.  ⊢ (n):.n ∈ NC: ⊃ :n = 0.v.(Em).m ∈ NC.n = m + 1.
*Theorem XI.2.25.  ⊢ (m,n,p):m,n,p ∈ NC.m ≤ n.n ≤ p. ⊃ .m ≤ p.

Proof.  Assume the hypothesis.  Then by Thm.XI.2.22 and rule C,

q₁ ∈ NC.n = m + q₁
q₂ ∈ NC.p = n + q₂.

So q₁ + q₂ ∈ NC.p = m + (q₁ + q₂).  So m ≤ p.
Corollary 1.  ⊢ (m₁,m₂,n₁,n₂):m₁,m₂,n₁,n₂ ∈ NC.m₁ ≤ m₂.n₁ ≤ n₂. ⊃ .
m₁ + n₁ ≤ m₂ + n₂.

Proof.  Use Thm.XI.2.23.

Corollary 2.  ⊢ (m,n,p):m,n,p ∈ NC.m < n.n ≤ p. ⊃ .m < p.
Corollary 3.  ⊢ (m,n,p):m,n,p ∈ NC.m ≤ n.n < p. ⊃ .m < p.

Corollary 4.  ⊢ (m,n,p):m,n,p ∈ NC.m < n.n < p. ⊃ .m < p.

The reader will observe that we now have the familiar properties of +
and ≤ except for the statement that any two cardinal numbers are
"comparable," to wit

(m,n):.m,n ∈ NC: ⊃ :m < n.v.m = n.v.m > n.

So far as we know, there is no way to prove this without assuming the
axiom of choice.  We shall say more on this point in Chapter XV.

To put it otherwise, we have proved that NC is partially ordered by ≤,
but we have not proved that it is simply ordered.

We now introduce multiplication of cardinals, and define

m ×_c n         for         α̂(Eβ,γ).β ∈ m.γ ∈ n.α sm (β × γ),

where α, β, and γ are variables which do not occur in m or n.

We shall usually omit the subscript c, letting the reader decide from the
context whether α × β or m ×_c n is intended.

The free occurrences of variables in m × n are just those of m and n.

For stratification, m and n have to have the same type, which will be the
type of m × n.

Theorem XI.2.26.  ⊢ (α,m,n):.α ∈ m ×_c n: ≡ :(Eβ,γ).β ∈ m.γ ∈ n.
α sm (β × γ).

Theorem XI.2.27.  ⊢ (β,γ).Nc(β) ×_c Nc(γ) = Nc(β × γ).

Proof.  Let α ∈ Nc(β) ×_c Nc(γ).  Then by rule C, β₁ ∈ Nc(β).γ₁ ∈ Nc(γ).
α sm (β₁ × γ₁).  Hence β₁ sm β and γ₁ sm γ.  So by Thm.XI.1.18, Part II,
(β₁ × γ₁) sm (β × γ).  So α sm (β × γ).  Hence α ∈ Nc(β × γ).  So

(1)  ⊢ Nc(β) ×_c Nc(γ) ⊆ Nc(β × γ).

Conversely, let α ∈ Nc(β × γ).  So α sm (β × γ).  However, β ∈ Nc(β) and
γ ∈ Nc(γ).  Hence α ∈ Nc(β) ×_c Nc(γ).  Thus

(2)  ⊢ Nc(β × γ) ⊆ Nc(β) ×_c Nc(γ).

Corollary.  ⊢ (m,n):m,n ∈ NC. ⊃ .m × n ∈ NC.
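
On finite classes, Thm.XI.2.27 reduces to the familiar fact that a cardinal product has as many members as the product of the sizes, and that replacing a factor by a similar class does not change the result. The Python sketch below illustrates this under assumptions: the name `cartesian` and the sample sets are hypothetical, and sizes again stand in for cardinal numbers.

```python
from itertools import product


def cartesian(beta, gamma):
    """Finite stand-in for the class beta x gamma of ordered pairs."""
    return set(product(beta, gamma))


# Hypothetical sample classes; beta and beta1 are similar, as are gamma and gamma1.
beta, beta1 = {1, 2, 3}, {"p", "q", "r"}
gamma, gamma1 = {"a", "b"}, {10, 20}

pairs = cartesian(beta, gamma)
pairs1 = cartesian(beta1, gamma1)
```

The two products contain different pairs but have the same size, which is the finite shadow of the step (β₁ × γ₁) sm (β × γ) in the proof.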
**Theorem XI.2.28.  ⊢ (m,n).m × n = n × m.

Proof.  By Thm.XI.1.17, ⊢ α sm (β × γ). ≡ .α sm (γ × β).  So

⊢ (Eβ,γ).β ∈ m.γ ∈ n.α sm (β × γ)

. ≡ .(Eβ,γ).β ∈ m.γ ∈ n.α sm (γ × β)

. ≡ .(Eγ,β).γ ∈ n.β ∈ m.α sm (γ × β).

**Theorem XI.2.29.  ⊢ (m,n,p):m,n,p ∈ NC. ⊃ .(m × n) × p = m ×
(n × p).
Proof.  Let m,n,p ∈ NC.  Then by rule C, m = Nc(α), n = Nc(β),
p = Nc(γ).  Then by Thm.XI.2.27, m × n = Nc(α × β) and n × p =
Nc(β × γ).  So by Thm.XI.2.27,

(m × n) × p = Nc((α × β) × γ),
m × (n × p) = Nc(α × (β × γ)).
But, by Thm.XI.1.19,

Nc((α × β) × γ) = Nc(α × (β × γ)).

**Theorem XI.2.30.  ⊢ (m,n,p):m,n,p ∈ NC. ⊃ .(m + n) × p = (m × p)
+ (n × p).

Proof.  Let m,n,p ∈ NC.  Then m + n ∈ NC.  So by rule C, m + n =
Nc(α), p = Nc(δ).  Then α ∈ m + n, so that by Thm.X.1.1 and rule C,
β ∈ m.γ ∈ n.β ∩ γ = Λ.β ∪ γ = α.  So m = Nc(β), n = Nc(γ), m + n =
Nc(β ∪ γ).  Now by Thm.XI.2.27,

(1)  (m + n) × p = Nc((β ∪ γ) × δ).

But by Thm.X.3.11,

(2)  (β ∪ γ) × δ = (β × δ) ∪ (γ × δ)

(β ∩ γ) × δ = (β × δ) ∩ (γ × δ).

However, β ∩ γ = Λ.  So by Thm.X.2.12, Cor. 3, (β × δ) ∩ (γ × δ) = Λ.
Hence, by (1), (2), and Thm.XI.2.9,

(3)  (m + n) × p = Nc(β × δ) + Nc(γ × δ).

By Thm.XI.2.27, m × p = Nc(β × δ) and n × p = Nc(γ × δ).  So

(m + n) × p = (m × p) + (n × p).

*Theorem XI.2.31.  ⊢ (m,n,p):m,n,p ∈ NC. ⊃ .m × (n + p) = (m × n)
+ (m × p).

Proof.  Let m,n,p ∈ NC.  Then m × (n + p) = (n + p) × m =
(n × m) + (p × m) = (m × n) + (m × p).

Theorem XI.2.32.  ⊢ (m):m ≠ Λ. ⊃ .m × 0 = 0.

Proof.  By Thm.XI.2.26,

⊢ α ∈ m × 0. ≡ .(Eβ,γ).β ∈ m.γ ∈ 0.α sm (β × γ)
. ≡ .(Eβ,γ).β ∈ m.γ = Λ.α sm (β × γ)
. ≡ .(Eβ).β ∈ m.α sm (β × Λ).

However, by Thm.X.2.12, Cor. 3, β × Λ = Λ.  So

⊢ α ∈ m × 0. ≡ .(Eβ).β ∈ m.α sm Λ
. ≡ .(Eβ).β ∈ m.α ∈ 0
by Thm.XI.1.9, corollary.  So if m ≠ Λ, then (Eβ).β ∈ m, and so

⊢ α ∈ m × 0. ≡ .α ∈ 0.

Corollary.  ⊢ (m):m ∈ NC. ⊃ .m × 0 = 0.
*Theorem XI.2.33.  ⊢ (m):m ∈ NC. ⊃ .m × 1 = m.
Proof.  Let m ∈ NC.  Then by rule C, m = Nc(δ).  Now by Thm.XI.2.26,

α ∈ m × 1. ≡ .(Eβ,γ).β ∈ m.γ ∈ 1.α sm (β × γ)

. ≡ .(Eβ,γ,x).β sm δ.γ = {x}.α sm (β × γ)
. ≡ .(Eβ,x).β sm δ.α sm (β × {x}).

But by Thm.XI.1.20,

⊢ α sm (β × {x}). ≡ .α sm β.
So

α ∈ m × 1. ≡ .(Eβ,x).β sm δ.α sm β

≡ .(Eβ).β sm δ.α sm β

≡ .α sm δ

≡ .α ∈ Nc(δ)

≡ .α ∈ m.

Corollary 1.  ⊢ (m):m ∈ NC. ⊃ .m × 2 = m + m.

Corollary 2.  ⊢ (m):m ∈ NC. ⊃ .m × 3 = m + m + m.

Theorem XI.2.34.  ⊢ (m,n):m,n ∈ NC.m × n = 0. ⊃ .m = 0.v.n = 0.

Proof.  Let m,n ∈ NC and m × n = 0.  Then by rule C, m = Nc(α),
n = Nc(β).  Then by Thm.XI.2.27, m × n = Nc(α × β).  So α × β ∈
m × n.  So α × β ∈ 0.  So α × β = Λ.

Case 1.  α = Λ.  Then m = Nc(Λ) = 0.  So m = 0.v.n = 0.

Case 2.  α ≠ Λ.  Since α × Λ = Λ (see Thm.X.2.12, Cor. 3), we have
α × Λ = α × β.  Then by Thm.X.4.4, corollary, Part II, β = Λ.  So
n = Nc(Λ) = 0.  So m = 0.v.n = 0.

This theorem is very useful.  It is used to prove the corresponding result
for real and complex numbers.  This latter result is very widely used.
One standard use occurs in solving equations.  Thus to solve

x² − 4x + 3 = 0,

we factor the left side and write

(x − 3)(x − 1) = 0.

Then we infer that one of x − 3 or x − 1 must be zero, and hence that
x = 3 or x = 1.

This  is  a  standard  routine,  and  its  justification  goes  back  to  the  theorem 
just  proved. 
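
The routine can be checked mechanically for the equation above. The following Python sketch is only an illustration: the function names and the search range of integers are assumptions made here, and the search simply looks for values at which one of the two factors vanishes.

```python
def quadratic(x):
    """Left side of the equation x^2 - 4x + 3 = 0."""
    return x * x - 4 * x + 3


def factored(x):
    """The same expression written as a product of two factors."""
    return (x - 3) * (x - 1)


# Over a hypothetical search range, collect the values where some factor is zero.
search_range = range(-10, 11)
zeros_of_factors = [x for x in search_range if x - 3 == 0 or x - 1 == 0]
zeros_of_quadratic = [x for x in search_range if quadratic(x) == 0]
```

The two lists agree, which is the zero-product principle at work: the product vanishes exactly where one of its factors does.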

Theorem XI.2.35.  ⊢ (m,n,p):m,n,p ∈ NC.m ≤ n. ⊃ .m × p ≤ n × p.
Proof.  Assume m,n,p ∈ NC and m ≤ n.  Then by Thm.XI.2.22 and
rule C, q ∈ NC.n = m + q.  Then n × p = (m + q) × p = (m × p) +
(q × p).  So by Thm.XI.2.22, m × p ≤ n × p.

Corollary.  ⊢ (m₁,m₂,n₁,n₂):m₁,m₂,n₁,n₂ ∈ NC.m₁ ≤ m₂.n₁ ≤ n₂. ⊃ .
m₁ × n₁ ≤ m₂ × n₂.

Theorem XI.2.36.  ⊢ (m,n):m,n ∈ NC.n ≠ 0. ⊃ .m ≤ m × n.

Proof.  Assume m,n ∈ NC and n ≠ 0.  Then by Thm.XI.2.24, 1 ≤ n.
So by Thm.XI.2.35, m × 1 ≤ m × n.  Then by Thm.XI.2.33, m ≤ m × n.

Theorem XI.2.37.  ⊢ Nc(V) × Nc(V) = Nc(V).

Proof.  By Thm.IX.4.6, corollary, ⊢ V ≠ Λ, so that by Thm.XI.2.7,
Cor. 2, ⊢ Nc(V) ≠ 0.  Hence, by Thm.XI.2.36, ⊢ Nc(V) ≤ Nc(V) × Nc(V).
However, by Thm.XI.2.16, corollary, ⊢ Nc(V) × Nc(V) ≤ Nc(V).  So by
Thm.XI.2.20, ⊢ Nc(V) × Nc(V) = Nc(V).

Corollary.  ⊢ (V × V) sm V.

Proof.  By Thm.XI.2.27, ⊢ Nc(V) × Nc(V) = Nc(V × V).

This corollary states

⊢ (ER).V sm_R (V × V).

That is, the set of all objects is in one-to-one correspondence with the set of
all ordered pairs.  That is, there are exactly as many objects as there are
pairs of objects.  This sounds contradictory at first, but we shall learn that
for many infinite classes α which occur in mathematics, α sm (α × α) is
provable.  This is one of the ways in which infinite classes can differ from
finite classes.
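
A familiar instance of α sm (α × α) outside the present system is the class of nonnegative integers. The Python sketch below uses the classical Cantor pairing function to exhibit a one-to-one correspondence between pairs of nonnegative integers and single nonnegative integers. It is offered only as an illustration of the phenomenon, not as the argument of the text, and the names `pair` and `unpair` are assumptions introduced here.

```python
from math import isqrt


def pair(x: int, y: int) -> int:
    """Cantor pairing: number the pairs diagonal by diagonal (x + y constant)."""
    return (x + y) * (x + y + 1) // 2 + y


def unpair(z: int):
    """Inverse of pair: recover the diagonal w = x + y, then y, then x."""
    w = (isqrt(8 * z + 1) - 1) // 2   # largest w with w(w + 1)/2 <= z
    y = z - w * (w + 1) // 2
    return w - y, y


# Codes of all pairs on the first k diagonals, for a hypothetical bound k.
k = 20
codes = sorted(pair(x, y) for x in range(k) for y in range(k) if x + y < k)
```

The pairs on the first k diagonals receive exactly the codes 0, 1, ..., k(k + 1)/2 − 1 with no repetition, so no pair is missed and no integer is used twice.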

We now define exponentiation of cardinal numbers.  This is done in
terms of α ↑ β, as indicated in the preceding section.  We define

m ↑_c n         for         γ̂(Eα,β).USC(α) ∈ m.USC(β) ∈ n.γ sm (α ↑ β),

n^m         for         m ↑_c n.

In the first of these, α, β, and γ are variables which do not occur in m or n.

We shall usually omit the subscript c, letting the reader decide from the
context whether α ↑ β or m ↑_c n is intended.  The notation n^m is
unambiguous, since it always means m ↑_c n.

The free occurrences of variables in m ↑ n are just those of m and n.

For stratification, m and n have to have the same type, which will be the
type of m ↑_c n.

Note that this is not the same situation that holds for α ↑ β, since
α ↑ β is one type higher than the type of α and β.  If exponentiation is to
avoid type difficulties, n^m must have the same type as m and n.  We
arranged this by using USC(α) ∈ m and USC(β) ∈ n instead of α ∈ m and
β ∈ n in the definition of m ↑_c n.  This leads to certain complexities in
deriving the laws of exponents, but these complexities are not serious.
The notation m ↑ n for n^m is motivated as follows.  If

denotes

then it seems reasonable to use m ↑ n for n^m.

It will be noted that n^m = Λ unless (Eα).USC(α) ∈ m and (Eβ).USC(β) ∈ n.
There are cardinal numbers m and n for which n^m = Λ, and these cardinal
numbers fail to satisfy some of the usual laws of exponents.  However, for
all the cardinal numbers m which occur in ordinary mathematics, (Eα).
USC(α) ∈ m holds, and so all the usual laws of exponents hold for these
cardinal numbers.
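
For finite classes the connection between α ↑ β and the power n^m can be seen by listing the functions outright. The Python sketch below is an illustration under assumptions: the name `functions` and the sample classes are hypothetical, each function from α into β is represented as a set of (argument, value) pairs, and the type-raising role of USC in the definition is not modeled.

```python
from itertools import product


def functions(alpha, beta):
    """All functions with arguments exactly alpha and values in beta,
    each represented as a frozenset of (argument, value) pairs."""
    args = sorted(alpha)
    return {
        frozenset(zip(args, values))
        for values in product(sorted(beta), repeat=len(args))
    }


# Hypothetical sample classes (not from the text): m = 3 arguments, n = 2 values.
alpha = {1, 2, 3}
beta = {"a", "b"}
all_functions = functions(alpha, beta)
```

With three arguments and two possible values there are 2^3 functions, matching the convention that m ↑ n is written n^m. The null class of arguments yields exactly one function, the empty one, and a nonempty class of arguments admits no function into the null class.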

Theorem XI.2.38.  ⊢ (γ,m,n):γ ∈ n^m. ≡ .(Eα,β).USC(α) ∈ m.USC(β) ∈ n.
γ sm (α ↑ β).

Theorem XI.2.39.  ⊢ (α,β).Nc(USC(α)) ↑_c Nc(USC(β)) = Nc(α ↑ β).

Proof.  Since ⊢ USC(α) ∈ Nc(USC(α)) and ⊢ USC(β) ∈ Nc(USC(β)), it
follows by Thm.XI.2.1, and Thm.XI.2.38, that ⊢ γ ∈ Nc(α ↑ β). ⊃ .
γ ∈ (Nc(USC(α))) ↑_c Nc(USC(β)).  That is,

(1)  ⊢ Nc(α ↑ β) ⊆ Nc(USC(α)) ↑_c Nc(USC(β)).

Now let γ ∈ Nc(USC(α)) ↑_c Nc(USC(β)).  Then by Thm.XI.2.38 and
rule C, USC(φ) ∈ Nc(USC(α)), USC(θ) ∈ Nc(USC(β)), and γ sm (φ ↑ θ).
Then by Thm.XI.2.1, and Thm.XI.1.33, we have φ sm α and θ sm β.  Then
by Thm.XI.1.23, corollary, (φ ↑ θ) sm (α ↑ β).  So γ sm (α ↑ β).  So
γ ∈ Nc(α ↑ β).  Then by (1), our theorem follows.

As we said before, n^m = Λ unless (Eα).USC(α) ∈ m and (Eβ).USC(β) ∈ n.
So we devote a few theorems to conditions under which (Eα).USC(α) ∈ m.

Theorem XI.2.40.  ⊢ 0 = Nc(USC(Λ)).

Proof.  By Thm.IX.6.12, Part IV, ⊢ Λ = USC(Λ).  So ⊢ Nc(Λ) =
Nc(USC(Λ)).  Then our theorem follows by Thm.XI.2.7.

Theorem XI.2.41.  ⊢ (x).1 = Nc(USC({x})).

Proof.  By Thm.IX.6.16, ⊢ {{x}} = USC({x}).  So ⊢ Nc({{x}}) =
Nc(USC({x})).  Then our theorem follows by Thm.XI.2.8.

Theorem XI.2.42.  ⊢ (m):m^0 ≠ Λ. ≡ .(Eα).USC(α) ∈ m.

Proof.  Suppose m^0 ≠ Λ.  Then γ ∈ m^0, so that by Thm.XI.2.38,
(Eα).USC(α) ∈ m.  So

(1)  ⊢ m^0 ≠ Λ. ⊃ .(Eα).USC(α) ∈ m.

Conversely, assume (Eα).USC(α) ∈ m.  Then USC(α) ∈ m, and by Thm.
XI.2.40, USC(Λ) ∈ 0.  So by Thm.XI.2.38, (Λ ↑ α) ∈ m^0.  So

(2)  ⊢ (Eα).USC(α) ∈ m. ⊃ .m^0 ≠ Λ.
Corollary.  ⊢ (m):.m ∈ NC: ⊃ :m^0 ≠ Λ. ≡ .(Eα).m = Nc(USC(α)).

Because of this theorem, we can (and often will) use the shorter notation
m^0 ≠ Λ in place of the lengthier notation (Eα).USC(α) ∈ m.

Theorem XI.2.43.  ⊢ (m,n):.m,n ∈ NC: ⊃ :(m + n)^0 ≠ Λ. ≡ .m^0 ≠ Λ.
n^0 ≠ Λ.

Proof.  Assume (m + n)^0 ≠ Λ.  Then USC(α) ∈ m + n by Thm.XI.2.42.
Then β ∈ m.γ ∈ n.β ∩ γ = Λ.β ∪ γ = USC(α).  By Thm.IX.4.13, Cor. 7,
β ⊆ USC(α) and γ ⊆ USC(α).  By Thm.IX.6.14, Cor. 2, β = USC(φ),
γ = USC(θ).  So m^0 ≠ Λ.n^0 ≠ Λ.  Thus

(1)  ⊢ (m + n)^0 ≠ Λ. ⊃ .m^0 ≠ Λ.n^0 ≠ Λ.

Conversely, assume m,n ∈ NC, m^0 ≠ Λ, n^0 ≠ Λ.  Then by Thm.XI.2.42,
corollary, m = Nc(USC(β)), n = Nc(USC(γ)).  By Thm.XI.1.20, β sm
(β × 0) and γ sm (γ × {V}).  Hence by Thm.XI.1.33, m = Nc(USC(β ×
0)) and n = Nc(USC(γ × {V})).  As in the proof of Thm.XI.2.10, we get
(β × 0) ∩ (γ × {V}) = Λ.  By Thm.IX.6.12, Parts I and IV, USC(β × 0)
∩ USC(γ × {V}) = Λ.  By Thm.XI.2.9, m + n = Nc(USC(β × 0) ∪
USC(γ × {V})).  Then by Thm.IX.6.12, Part II, m + n = Nc(USC((β ×
0) ∪ (γ × {V}))).  Then (m + n)^0 ≠ Λ.  So

(2)  ⊢ m,n ∈ NC: ⊃ :m^0 ≠ Λ.n^0 ≠ Λ. ⊃ .(m + n)^0 ≠ Λ.

Theorem XI.2.44.  ⊢ (n):n ∈ Nn. ⊃ .n^0 ≠ Λ.

Proof.  We shall prove the theorem by induction on n.  That is, we use
Thm.X.1.13, taking F(n) to be n^0 ≠ Λ.  By Thm.XI.2.40 and Thm.
XI.2.42,

(1)  ⊢ F(0).

Assume n ∈ Nn.F(n).  Then n ∈ NC.n^0 ≠ Λ by Thm.XI.2.11.  Also, by
Thm.XI.2.41 and Thm.XI.2.42, ⊢ 1 ∈ NC.1^0 ≠ Λ.  So by Thm.XI.2.43,
(n + 1)^0 ≠ Λ.  So

(2)  ⊢ (n):n ∈ Nn.F(n). ⊃ .F(n + 1).

Corollary 1.  ⊢ 0^0 ≠ Λ.

Corollary 2.  ⊢ 1^0 ≠ Λ.

Corollary 3.  ⊢ 2^0 ≠ Λ.

etc.

This result tells us that all finite cardinal numbers n have the property
n^0 ≠ Λ.  Actually all cardinal numbers n of interest in mathematics have
the property n^0 ≠ Λ, so that for such numbers the usual laws of
exponentiation hold.

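Theorem XI.2.45. $\vdash (m,n): m,n \in \mathrm{NC} . m^0 \neq \Lambda . n^0 \neq \Lambda .\supset. (m \times n)^0 \neq \Lambda$.

Proof. Let $m,n \in \mathrm{NC} . m^0 \neq \Lambda . n^0 \neq \Lambda$. Then by Thm.XI.2.42, corollary,
$m = \mathrm{Nc}(\mathrm{USC}(\alpha))$, $n = \mathrm{Nc}(\mathrm{USC}(\beta))$. By Thm.XI.2.27,
$m \times n = \mathrm{Nc}(\mathrm{USC}(\alpha) \times \mathrm{USC}(\beta))$. By Thm.XI.1.41, $m \times n = \mathrm{Nc}(\mathrm{USC}(\alpha \times \beta))$.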

Theorem XI.2.46. $\vdash (m): m = \mathrm{Nc}(V) .\supset. m^0 = \Lambda$.

Proof. Assume $m = \mathrm{Nc}(V)$ and $m^0 \neq \Lambda$. Then by Thm.XI.2.42,
corollary, $m = \mathrm{Nc}(\mathrm{USC}(\alpha))$. Thus by Thm.IX.6.12, Cor. 2, and Thm.XI.2.13,
Part I, $m \leq \mathrm{Nc}(1)$. However, by Thm.XI.2.16, corollary,
$\mathrm{Nc}(1) \leq m$. So by Thm.XI.2.20, $m = \mathrm{Nc}(1)$. Hence $V \;\mathrm{sm}\; 1$. By the
definitions of 1 and $\mathrm{Can}(V)$, this is $\mathrm{Can}(V)$. Hence we have a contradiction
with Thm.XI.1.8.

This theorem shows the falsity of

$(n): n \in \mathrm{NC} .\supset. n^0 \neq \Lambda$.

In conjunction with Thm.XI.2.44, we can infer

$\vdash \sim \mathrm{Nc}(V) \in \mathrm{Nn}$,

so that not all cardinal numbers are finite.

We can also use this theorem to show the falsity of

$(m,n): m,n \in \mathrm{NC} . (m \times n)^0 \neq \Lambda .\supset. m^0 \neq \Lambda . n^0 \neq \Lambda$,

for we can take $m = \mathrm{Nc}(V)$, $n = 0$. Then we have $m \times n = 0$ by
Thm.XI.2.32, corollary. Hence by Thm.XI.2.40 and Thm.XI.2.42,
$(m \times n)^0 \neq \Lambda$. However, we have $m^0 = \Lambda$.

We shall show later how to prove the falsity of

$(m,n): m,n \in \mathrm{NC} . m^0 \neq \Lambda . n^0 \neq \Lambda .\supset. (n^m)^0 \neq \Lambda$.

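Theorem XI.2.47. $\vdash (m,n): m^0 \neq \Lambda . n^0 \neq \Lambda .\equiv. n^m \neq \Lambda$.

Proof. Assume $m^0 \neq \Lambda . n^0 \neq \Lambda$. Then by Thm.XI.2.42, $\mathrm{USC}(\alpha) \in m$
and $\mathrm{USC}(\beta) \in n$. By Thm.XI.2.38, $(\alpha \uparrow \beta) \in n^m$. Conversely, assume
$n^m \neq \Lambda$. Then $\gamma \in n^m$, so that by Thm.XI.2.38, $(E\alpha).\mathrm{USC}(\alpha) \in m$ and
$(E\beta).\mathrm{USC}(\beta) \in n$.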

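Theorem XI.2.48. $\vdash (m,n):. m,n \in \mathrm{NC} :\supset: m^0 \neq \Lambda . n^0 \neq \Lambda .\equiv. n^m \in \mathrm{NC}$.

Proof. By Thm.XI.2.47 and Thm.XI.2.2, Part VI,

(1) $\vdash n^m \in \mathrm{NC} .\supset. m^0 \neq \Lambda . n^0 \neq \Lambda$.

Assume $m,n \in \mathrm{NC}$ and $m^0 \neq \Lambda . n^0 \neq \Lambda$. Then by Thm.XI.2.42, corollary,
$m = \mathrm{Nc}(\mathrm{USC}(\alpha))$ and $n = \mathrm{Nc}(\mathrm{USC}(\beta))$. By Thm.XI.2.39,
$n^m = \mathrm{Nc}(\alpha \uparrow \beta)$. So $n^m \in \mathrm{NC}$.

Corollary 1. $\vdash (m):. m \in \mathrm{NC} :\supset: m^0 \neq \Lambda .\equiv. m^0 \in \mathrm{NC}$.

Corollary 2. $\vdash (m): m, m^0 \in \mathrm{NC} .\supset. (E\alpha). m = \mathrm{Nc}(\mathrm{USC}(\alpha))$.

Corollary 3. $\vdash (m,n): m, n, m^0, n^0 \in \mathrm{NC} .\supset. n^m \in \mathrm{NC}$.

When we are dealing with an $m$ which is a cardinal number, Cor. 1 gives
us still another notation, $m^0 \in \mathrm{NC}$, for the important condition
$(E\alpha).\mathrm{USC}(\alpha) \in m$.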
We now prove a series of theorems which will be recognized as the
familiar laws of exponents, except for the occurrence of conditions such as
$m^0 \in \mathrm{NC}$ in the hypotheses. As we have explained earlier, we have the
condition $m^0 \in \mathrm{NC}$ satisfied for all $m$'s of interest, and so have the usual
laws of exponents for all such $m$'s.

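Theorem XI.2.49. $\vdash (m): m, m^0 \in \mathrm{NC} .\supset. m^0 = 1$.

Proof. Assume $m, m^0 \in \mathrm{NC}$. By Thm.XI.2.48, Cor. 2, $m = \mathrm{Nc}(\mathrm{USC}(\alpha))$.
By Thm.XI.2.40 and Thm.XI.2.39, $m^0 = \mathrm{Nc}(\Lambda \uparrow \alpha)$. By Thm.XI.1.24,
$m^0 = \mathrm{Nc}(0)$. By Thm.XI.2.8, Cor. 1, $m^0 = 1$.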

Theorem XI.2.50. $\vdash 0^0 = 1$.

Proof. By Thm.XI.2.40 and Thm.XI.2.42, $\vdash 0^0 \neq \Lambda$. Then by
Thm.XI.2.7, Cor. 1, and Thm.XI.2.48, Cor. 1, $\vdash 0, 0^0 \in \mathrm{NC}$. Now use
Thm.XI.2.49.

This result is in distinction to the case of $0^0$ for real numbers, which is
usually considered as undefined.
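
As a finite illustration only (a minimal sketch in Python, not part of the formal development), one can count functions directly: for finite classes, the number of functions from an m-element class into an n-element class is n to the power m. The helper name `count_functions` is hypothetical. There is exactly one function from the empty class into any class, namely the empty function, which matches 0^0 = 1.

```python
from itertools import product

def count_functions(domain, codomain):
    """Count functions domain -> codomain by enumerating all value assignments."""
    domain, codomain = list(domain), list(codomain)
    # Each function corresponds to one tuple of values, one value per argument.
    return sum(1 for _ in product(codomain, repeat=len(domain)))

print(count_functions([], []))          # only the empty function
print(count_functions(["a"], []))       # no function from a nonempty class into the empty class
print(count_functions([], ["x", "y"]))  # again only the empty function
```

The second call illustrates the companion fact that 0 to a nonzero power is 0.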

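Theorem XI.2.51. $\vdash (m): m, m^0 \in \mathrm{NC} . m \neq 0 .\supset. 0^m = 0$.

Proof. Assume $m, m^0 \in \mathrm{NC}$, and $m \neq 0$. Then by Thm.XI.2.48, Cor. 2,
$m = \mathrm{Nc}(\mathrm{USC}(\alpha))$. By Thm.XI.2.40 and Thm.XI.2.39, $0^m = \mathrm{Nc}(\alpha \uparrow \Lambda)$.
Now, since $\mathrm{USC}(\alpha) \in m$ and $m \neq 0$, we get $\mathrm{USC}(\alpha) \neq \Lambda$ by Thm.XI.2.7,
Cor. 4. By Thm.IX.6.12, Part IV, $\alpha \neq \Lambda$. Then by Thm.XI.1.25,
$\alpha \uparrow \Lambda = \Lambda$. So $0^m = \mathrm{Nc}(\Lambda) = 0$.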

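Theorem XI.2.52. $\vdash (m): m, m^0 \in \mathrm{NC} .\supset. m^1 = m$.

Proof. Assume $m, m^0 \in \mathrm{NC}$. By Thm.XI.2.48, Cor. 2, $m = \mathrm{Nc}(\mathrm{USC}(\alpha))$.
Also, by Thm.XI.2.41, $1 = \mathrm{Nc}(\mathrm{USC}(0))$. By Thm.XI.2.39, $m^1 = \mathrm{Nc}(0 \uparrow \alpha)$.
However, by Thm.XI.1.38, $(0 \uparrow \alpha) \;\mathrm{sm}\; \mathrm{USC}(\alpha)$. Hence
$\mathrm{Nc}(0 \uparrow \alpha) = \mathrm{Nc}(\mathrm{USC}(\alpha)) = m$. So $m^1 = m$.

Theorem XI.2.53. $\vdash (m): m, m^0 \in \mathrm{NC} .\supset. 1^m = 1$.

Proof. Similar to that of Thm.XI.2.52, except that Thm.XI.1.27 and
Thm.XI.2.8 are used.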

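Theorem XI.2.54. $\vdash (m,n,p): m, n, p, m^0, (n+p)^0 \in \mathrm{NC} .\supset. m^{n+p} = m^n \times m^p$.

Proof. Assume $m, n, p, m^0, (n+p)^0 \in \mathrm{NC}$. Then $m = \mathrm{Nc}(\mathrm{USC}(\alpha))$ and
$n + p = \mathrm{Nc}(\mathrm{USC}(\delta))$. Then $\mathrm{USC}(\delta) \in n + p$, so that
$\phi \in n . \theta \in p . \phi \cap \theta = \Lambda . \phi \cup \theta = \mathrm{USC}(\delta)$. By Thm.IX.6.14, Cor. 2,
$\phi = \mathrm{USC}(\beta)$, $\theta = \mathrm{USC}(\gamma)$. Then
$\mathrm{USC}(\beta) \in n . \mathrm{USC}(\gamma) \in p . \mathrm{USC}(\beta) \cap \mathrm{USC}(\gamma) = \Lambda . \mathrm{USC}(\beta) \cup \mathrm{USC}(\gamma) = \mathrm{USC}(\delta)$.
Then by Thm.XI.2.2, Part III, $n = \mathrm{Nc}(\mathrm{USC}(\beta))$, $p = \mathrm{Nc}(\mathrm{USC}(\gamma))$,
and by Thm.XI.2.39

$m^{n+p} = \mathrm{Nc}(\delta \uparrow \alpha)$,
$m^n = \mathrm{Nc}(\beta \uparrow \alpha)$,
$m^p = \mathrm{Nc}(\gamma \uparrow \alpha)$.

So by Thm.XI.2.27,

$m^n \times m^p = \mathrm{Nc}((\beta \uparrow \alpha) \times (\gamma \uparrow \alpha))$.

By Thm.IX.6.12, Parts I and II, $\mathrm{USC}(\beta \cap \gamma) = \Lambda$ and
$\mathrm{USC}(\beta \cup \gamma) = \mathrm{USC}(\delta)$. As $\vdash \Lambda = \mathrm{USC}(\Lambda)$ by Thm.IX.6.2, Part IV, we get
$\beta \cap \gamma = \Lambda$ and $\beta \cup \gamma = \delta$ by Thm.IX.6.11, Cor. 1. Then by Thm.XI.1.29,
$(\delta \uparrow \alpha) \;\mathrm{sm}\; ((\beta \uparrow \alpha) \times (\gamma \uparrow \alpha))$. So
$\mathrm{Nc}(\delta \uparrow \alpha) = \mathrm{Nc}((\beta \uparrow \alpha) \times (\gamma \uparrow \alpha))$.
Thus $m^{n+p} = m^n \times m^p$.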
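
The similarity used in the last step can be made concrete for small finite classes. The following is a rough sketch in Python, under the assumption that functions are represented as dictionaries; the names `functions` and `split` are hypothetical helpers. It sends each function on a disjoint union to its pair of restrictions and checks that this correspondence is one-to-one and onto.

```python
from itertools import product

def functions(domain, codomain):
    """All functions domain -> codomain, each represented as a dict."""
    domain = list(domain)
    return [dict(zip(domain, values))
            for values in product(codomain, repeat=len(domain))]

def split(f, beta, gamma):
    """Send a function on the union of beta and gamma to its two restrictions."""
    return ({x: f[x] for x in beta}, {x: f[x] for x in gamma})

alpha = [0, 1, 2]                      # values
beta, gamma = ["b1", "b2"], ["c1"]     # disjoint argument classes
delta = beta + gamma                   # their union

whole = functions(delta, alpha)
# Freeze each pair of restrictions so that distinct pairs can be counted.
pairs = {(tuple(sorted(g.items())), tuple(sorted(h.items())))
         for g, h in (split(f, beta, gamma) for f in whole)}

# Distinct functions give distinct pairs, and every pair of restrictions occurs.
assert len(pairs) == len(whole)
assert len(whole) == len(functions(beta, alpha)) * len(functions(gamma, alpha))
```

Here the counts are 3^(2+1) = 27 on one side and 3^2 × 3^1 = 27 on the other.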

Theorem XI.2.55. $\vdash (m): m, m^0 \in \mathrm{NC} .\supset. m^2 = m \times m$.

Proof. By Thm.XI.2.11, Cor. 1, $\vdash 2 \in \mathrm{NC}$, and by Thm.XI.2.44, Cor. 3,
$\vdash 2^0 \neq \Lambda$. Then by Thm.XI.2.48, Cor. 2, $\vdash 2, 2^0 \in \mathrm{NC}$. Since $\vdash 2 = 1 + 1$,
we have by Thm.XI.2.54,

$\vdash m, m^0 \in \mathrm{NC} .\supset. m^2 = m^1 \times m^1$.

However, by Thm.XI.2.52,

$\vdash m, m^0 \in \mathrm{NC} .\supset. m^1 = m$.

So our theorem follows.

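Theorem XI.2.56. $\vdash (m,n,p): m, n, p, p^0, (m^n)^0 \in \mathrm{NC} .\supset. (m^n)^p = m^{n \times p}$.

Proof. Assume $m, n, p, p^0, (m^n)^0 \in \mathrm{NC}$. Then $p = \mathrm{Nc}(\mathrm{USC}(\gamma))$. Also, by
Thm.XI.2.2, Part VI, $(m^n)^0 \neq \Lambda$. By Thm.XI.2.42, $\mathrm{USC}(\delta) \in m^n$. By
Thm.XI.2.38, $\mathrm{USC}(\alpha) \in m$, $\mathrm{USC}(\beta) \in n$, and $(\beta \uparrow \alpha) \;\mathrm{sm}\; \mathrm{USC}(\delta)$. Then by
Thm.XI.1.30,

(1) $\mathrm{Nc}(\gamma \uparrow \delta) = \mathrm{Nc}((\gamma \times \beta) \uparrow \alpha)$.

Now by Thm.XI.2.42, $m^0 \neq \Lambda$ and $n^0 \neq \Lambda$, so that by Thm.XI.2.48,
$m^n \in \mathrm{NC}$. Then by Thm.XI.2.2, Part III, $m = \mathrm{Nc}(\mathrm{USC}(\alpha))$,
$n = \mathrm{Nc}(\mathrm{USC}(\beta))$, and $m^n = \mathrm{Nc}(\mathrm{USC}(\delta))$. By Thm.XI.2.27,
$p \times n = \mathrm{Nc}(\mathrm{USC}(\gamma) \times \mathrm{USC}(\beta))$. By Thm.XI.1.41, $p \times n = \mathrm{Nc}(\mathrm{USC}(\gamma \times \beta))$.
By Thm.XI.2.39,

$(m^n)^p = \mathrm{Nc}(\gamma \uparrow \delta)$,

$m^{p \times n} = \mathrm{Nc}((\gamma \times \beta) \uparrow \alpha)$.

So by (1),

$(m^n)^p = m^{p \times n}$.

Then by Thm.XI.2.28,

$(m^n)^p = m^{n \times p}$.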
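
For finite classes, the similarity behind this law is the familiar passage between a function of two arguments and a function that returns a function. The sketch below, in Python, assumes functions are stored as dictionaries; `curry` and `uncurry` are hypothetical helper names, not notation from the text.

```python
from itertools import product

def curry(f, gamma, beta):
    """Turn f : gamma x beta -> alpha into a function gamma -> (beta -> alpha)."""
    return {c: {b: f[(c, b)] for b in beta} for c in gamma}

def uncurry(g):
    """Inverse of curry: flatten gamma -> (beta -> alpha) back to gamma x beta -> alpha."""
    return {(c, b): v for c, inner in g.items() for b, v in inner.items()}

alpha, beta, gamma = [0, 1], ["b1", "b2"], ["c1", "c2", "c3"]  # sizes m = 2, n = 2, p = 3
pairs = list(product(gamma, beta))

# Every function gamma x beta -> alpha survives a round trip through curry/uncurry.
all_f = [dict(zip(pairs, values)) for values in product(alpha, repeat=len(pairs))]
assert all(uncurry(curry(f, gamma, beta)) == f for f in all_f)

m, n, p = len(alpha), len(beta), len(gamma)
assert len(all_f) == m ** (n * p) == (m ** n) ** p
```

The round trip shows the correspondence is one-to-one, and the final count agrees with both sides of the exponent law for these sizes.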

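Theorem XI.2.57. $\vdash (m,n,p): m, n, p, n^0, p^0 \in \mathrm{NC} . m \leq n .\supset. m^p \leq n^p$.

Proof. Assume the hypothesis. Then $n = \mathrm{Nc}(\mathrm{USC}(\beta))$,
$p = \mathrm{Nc}(\mathrm{USC}(\gamma))$. Also, by Thm.XI.2.22, $q \in \mathrm{NC} . n = m + q$. Then
$\mathrm{USC}(\beta) \in m + q$. So $\phi \in m . \theta \in q . \phi \cap \theta = \Lambda . \phi \cup \theta = \mathrm{USC}(\beta)$. By Thm.IX.6.14,
Cor. 2, $\alpha \subseteq \beta . \phi = \mathrm{USC}(\alpha)$. Then $\mathrm{USC}(\alpha) \in m$, so that by Thm.XI.2.2,
Part III, $m = \mathrm{Nc}(\mathrm{USC}(\alpha))$. By Thm.XI.2.39,

$m^p = \mathrm{Nc}(\gamma \uparrow \alpha)$,

$n^p = \mathrm{Nc}(\gamma \uparrow \beta)$.

However, by Thm.XI.1.31, $(\gamma \uparrow \alpha) \subseteq (\gamma \uparrow \beta)$. Then by Thm.XI.2.13,
Part I,

$\mathrm{Nc}(\gamma \uparrow \alpha) \leq \mathrm{Nc}(\gamma \uparrow \beta)$.

So

$m^p \leq n^p$.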

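Theorem XI.2.58. $\vdash (m,n): m \in \mathrm{NC} . m^n = 0 .\supset. m = 0$.

Proof. Assume $m \in \mathrm{NC}$ and $m^n = 0$. Then $\Lambda \in m^n$, so that by
Thm.XI.2.38, $\mathrm{USC}(\alpha) \in m . \mathrm{USC}(\beta) \in n . \Lambda \;\mathrm{sm}\; (\beta \uparrow \alpha)$. Then $(\beta \uparrow \alpha) = \Lambda$ by
Thm.XI.1.9, and by Thm.XI.1.32, $\alpha = \Lambda$. So $\mathrm{USC}(\alpha) = \Lambda$, and $\Lambda \in m$.
Then $m = 0$ by Thm.XI.2.7, Cor. 4.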

Theorem XI.2.59. $\vdash (m,n,p): m, n, p, m^0, p^0 \in \mathrm{NC} . n \leq p . m \neq 0 .\supset. m^n \leq m^p$.

Proof. Assume the hypothesis. Then by Thm.XI.2.22, $q \in \mathrm{NC} . p = n + q$.
By Thm.XI.2.48, Cor. 2, and Thm.XI.2.42, corollary, $p^0 \neq \Lambda$.
By Thm.XI.2.43, $n^0 \neq \Lambda . q^0 \neq \Lambda$. Also by Thm.XI.2.48, Cor. 1, $m^0 \neq \Lambda$.
Then by Thm.XI.2.48, $m^n \in \mathrm{NC}$, $m^q \in \mathrm{NC}$. By Thm.XI.2.54,

(1) $m^p = m^n \times m^q$.

By Thm.XI.2.58, $m^q \neq 0$. So by Thm.XI.2.36,

(2) $m^n \leq m^n \times m^q$.

By (1) and (2), our theorem follows.

Theorem XI.2.60. $\vdash (m,n): m,n \in \mathrm{NC} . m^n = 1 .\supset. n = 0 \vee m = 1$.

Proof. Assume $m,n \in \mathrm{NC}$ and $m^n = 1$. Then $m^n \neq \Lambda$, so that by
Thm.XI.2.47, $m^0 \neq \Lambda$ and $n^0 \neq \Lambda$. By Thm.XI.2.48, Cor. 1, $m^0, n^0 \in \mathrm{NC}$.

Case 1. $n = 0$. Then $n = 0 \vee m = 1$.

Case 2. $n \neq 0$. Then by Thm.XI.2.24,

(1) $1 \leq n$.

By Thm.XI.2.51, $m = 0 \supset m^n = 0$. So by Thm.IX.6.15, Cor. 2, $m \neq 0$.
Hence by Thm.XI.2.24,

(2) $1 \leq m$.

Also, by (1) and Thm.XI.2.59, $m^1 \leq m^n$. Then by Thm.XI.2.52,
$m \leq m^n$, so that

(3) $m \leq 1$.

By (2) and (3) and Thm.XI.2.20, $m = 1$, so that $n = 0 \vee m = 1$.

Hence, in either case, $n = 0 \vee m = 1$.

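For most of the familiar classes $\alpha$ of mathematics, $\mathrm{Can}(\alpha)$ holds. We
now consider some of the special properties of $\mathrm{Nc}(\alpha)$ when $\mathrm{Can}(\alpha)$.

Theorem XI.2.61. $\vdash (m):. m \in \mathrm{NC} :\supset: (E\alpha).\alpha \in m . \mathrm{Can}(\alpha) .\equiv. (\alpha). \alpha \in m \supset \mathrm{Can}(\alpha)$.

Proof. Assume $m \in \mathrm{NC}$, $(E\alpha).\alpha \in m . \mathrm{Can}(\alpha)$, and $\beta \in m$. Then by rule C,
$\alpha \in m . \mathrm{Can}(\alpha)$. Then $\alpha \;\mathrm{sm}\; \beta$ by Thm.XI.2.3, Part I. So $\mathrm{Can}(\beta)$ by
Thm.XI.1.34. Thus

$m \in \mathrm{NC}$, $(E\alpha).\alpha \in m . \mathrm{Can}(\alpha)$, $\beta \in m \vdash \mathrm{Can}(\beta)$.

So

$m \in \mathrm{NC}$, $(E\alpha).\alpha \in m . \mathrm{Can}(\alpha) \vdash (\beta). \beta \in m \supset \mathrm{Can}(\beta)$.

So

(1) $m \in \mathrm{NC} \vdash (E\alpha).\alpha \in m . \mathrm{Can}(\alpha) .\supset. (\alpha). \alpha \in m \supset \mathrm{Can}(\alpha)$.

Conversely, assume $m \in \mathrm{NC}$ and $(\alpha). \alpha \in m \supset \mathrm{Can}(\alpha)$. By Thm.XI.2.2,
Part VI, and rule C, $\alpha \in m$. So $\mathrm{Can}(\alpha)$. So $(E\alpha).\alpha \in m . \mathrm{Can}(\alpha)$.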

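Theorem XI.2.62. $\vdash (\alpha,m): m \in \mathrm{NC} . \alpha \in m . \mathrm{Can}(\alpha) .\supset. \mathrm{USC}(\alpha) \in m$.

Proof. Use Thm.XI.2.3, Part II, with the definition of $\mathrm{Can}(\alpha)$.

Corollary. $\vdash (m):. m \in \mathrm{NC} : (E\alpha).\alpha \in m . \mathrm{Can}(\alpha) :\supset: m^0 \neq \Lambda$.

Proof. Use Thm.XI.2.42.

Theorem XI.2.63. $\vdash (m,n):. m,n \in \mathrm{NC} : (E\alpha).\alpha \in m . \mathrm{Can}(\alpha) : (E\alpha).\alpha \in n . \mathrm{Can}(\alpha) :\supset: n^m \in \mathrm{NC}$.

Proof. Use Thm.XI.2.62, corollary, with Thm.XI.2.48.

Corollary. $\vdash (m):. m \in \mathrm{NC} : (E\alpha).\alpha \in m . \mathrm{Can}(\alpha) :\supset: m^0 \in \mathrm{NC}$.

Theorem XI.2.64. $\vdash (m,n):. m,n \in \mathrm{NC} : (E\alpha).\alpha \in m . \mathrm{Can}(\alpha) : (E\alpha).\alpha \in n . \mathrm{Can}(\alpha) :\supset: (E\alpha).\alpha \in m + n . \mathrm{Can}(\alpha)$.

Proof. Assume the hypothesis. Then $m + n \in \mathrm{NC}$ by Thm.XI.2.10.
Hence $\alpha \in m + n$ by Thm.XI.2.2, Part VI, and rule C. Then
$\beta \in m . \gamma \in n . \beta \cap \gamma = \Lambda . \beta \cup \gamma = \alpha$. By Thm.XI.2.61, $\mathrm{Can}(\beta)$ and $\mathrm{Can}(\gamma)$. Then
$\mathrm{Can}(\beta \cup \gamma)$ by Thm.XI.1.39. Thus $\mathrm{Can}(\alpha)$, so that
$(E\alpha).\alpha \in m + n . \mathrm{Can}(\alpha)$.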

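Theorem XI.2.65. $\vdash \mathrm{Can}(\Lambda)$.

Proof. By Thm.IX.6.12, Part IV, $\vdash \Lambda \;\mathrm{sm}\; \mathrm{USC}(\Lambda)$.

Corollary. $\vdash (E\alpha).\alpha \in 0 . \mathrm{Can}(\alpha)$.

Theorem XI.2.66. $\vdash (x).\mathrm{Can}(\{x\})$.

Proof. By Thm.XI.1.10, $\vdash \{x\} \;\mathrm{sm}\; \{\{x\}\}$. So by Thm.IX.6.16,
$\vdash \{x\} \;\mathrm{sm}\; \mathrm{USC}(\{x\})$.

Corollary. $\vdash (E\alpha).\alpha \in 1 . \mathrm{Can}(\alpha)$.

Theorem XI.2.67. $\vdash (m):. m \in \mathrm{NC} . (E\alpha).\alpha \in m . \mathrm{Can}(\alpha) :\supset: (E\alpha).\alpha \in m + 1 . \mathrm{Can}(\alpha)$.

Proof. Use Thm.XI.2.64 and Thm.XI.2.66, corollary.

Corollary 1. $\vdash (E\alpha).\alpha \in 2 . \mathrm{Can}(\alpha)$.

Corollary 2. $\vdash (E\alpha).\alpha \in 3 . \mathrm{Can}(\alpha)$.

etc.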

One might think that by combining Thm.XI.2.65, corollary, and
Thm.XI.2.67, one could prove

$(n): n \in \mathrm{Nn} .\supset. (E\alpha).\alpha \in n . \mathrm{Can}(\alpha)$

by induction (Thm.X.1.13) on $n$. However, $(E\alpha).\alpha \in n . \mathrm{Can}(\alpha)$ is not
stratified, and so one is not entitled to use Thm.X.1.13. As far as we
know, the result in question cannot be proved. This seems a bit surprising,
since we can prove each of

$\vdash (E\alpha).\alpha \in 0 . \mathrm{Can}(\alpha)$,
$\vdash (E\alpha).\alpha \in 1 . \mathrm{Can}(\alpha)$,
$\vdash (E\alpha).\alpha \in 2 . \mathrm{Can}(\alpha)$,
etc.

However, the distinction between proving each of the results just listed
and proving

$\vdash (n): n \in \mathrm{Nn} .\supset. (E\alpha).\alpha \in n . \mathrm{Can}(\alpha)$

is just the distinction between the use of induction to prove results about
the symbolic logic and the use of induction within the symbolic logic.

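Theorem XI.2.68. $\vdash (m,n):. (E\alpha).\alpha \in m . \mathrm{Can}(\alpha) : (E\alpha).\alpha \in n . \mathrm{Can}(\alpha) :\supset: (E\alpha).\alpha \in m \times n . \mathrm{Can}(\alpha)$.

Proof. Assume the hypothesis. By rule C, $\alpha \in m . \mathrm{Can}(\alpha)$ and
$\beta \in n . \mathrm{Can}(\beta)$. So by Thm.XI.2.26, $(\alpha \times \beta) \in m \times n$, and by Thm.XI.1.42,
$\mathrm{Can}(\alpha \times \beta)$.

Theorem XI.2.69. $\vdash (m,n):. m,n \in \mathrm{NC} : (E\alpha).\alpha \in m . \mathrm{Can}(\alpha) : (E\alpha).\alpha \in n . \mathrm{Can}(\alpha) :\supset: (E\alpha).\alpha \in n^m . \mathrm{Can}(\alpha)$.

Proof. Assume the hypothesis. By rule C, $\alpha \in m . \mathrm{Can}(\alpha)$ and
$\beta \in n . \mathrm{Can}(\beta)$. By Thm.XI.2.62, $\mathrm{USC}(\alpha) \in m$ and $\mathrm{USC}(\beta) \in n$. So by
Thm.XI.2.38, $(\alpha \uparrow \beta) \in n^m$, and by Thm.XI.1.45, $\mathrm{Can}(\alpha \uparrow \beta)$.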

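Theorem XI.2.70. $\vdash (\alpha,m): m = \mathrm{Nc}(\mathrm{USC}(\alpha)) .\supset. 2^m = \mathrm{Nc}(\mathrm{SC}(\alpha))$.

Proof. Temporarily let $A$ denote $\{\Lambda, V\}$. By Thm.IX.4.6, corollary,
and Thm.XI.1.16, Cor. 1, $\vdash A \in 2$. So by Thm.XI.2.67, Cor. 1, and
Thm.XI.2.61, $\vdash \mathrm{Can}(A)$. Then by Thm.XI.2.62, $\vdash \mathrm{USC}(A) \in 2$. Hence,
$\vdash 2 = \mathrm{Nc}(\mathrm{USC}(A))$. If now, $m = \mathrm{Nc}(\mathrm{USC}(\alpha))$, then by Thm.XI.2.39,
$2^m = \mathrm{Nc}(\alpha \uparrow A)$. However, by Thm.XI.1.28, $\mathrm{Nc}(\alpha \uparrow A) = \mathrm{Nc}(\mathrm{SC}(\alpha))$.
So $2^m = \mathrm{Nc}(\mathrm{SC}(\alpha))$.

We can now show the falsity of

$(m,n): m,n \in \mathrm{NC} . m^0 \neq \Lambda . n^0 \neq \Lambda .\supset. (n^m)^0 \neq \Lambda$.

We merely take $m$ to be $\mathrm{Nc}(\mathrm{USC}(V))$ and $n$ to be 2. Then $\vdash m,n \in \mathrm{NC}$.
Also $\vdash m^0 \neq \Lambda$ by Thm.XI.2.42. Also $\vdash n^0 \neq \Lambda$ by Thm.XI.2.44, Cor. 3.
However, $\vdash n^m = \mathrm{Nc}(\mathrm{SC}(V))$ by Thm.XI.2.70, and so $\vdash n^m = \mathrm{Nc}(V)$ by
Thm.IX.6.19, Part II. So by Thm.XI.2.46, $\vdash (n^m)^0 = \Lambda$.

Theorem XI.2.71. $\vdash (m): m, m^0 \in \mathrm{NC} .\supset. m < 2^m$.

Proof. Assume $m, m^0 \in \mathrm{NC}$. By Thm.XI.2.48, Cor. 2, $m = \mathrm{Nc}(\mathrm{USC}(\alpha))$.
So by Thm.XI.2.70, $\mathrm{Nc}(\mathrm{SC}(\alpha)) = 2^m$. However, by Thm.XI.2.17,
$m < \mathrm{Nc}(\mathrm{SC}(\alpha))$.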
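
For finite classes, both facts can be checked by brute force. The sketch below is in Python and is only an illustration under the assumption that classes are small finite collections; `subclasses` is a hypothetical helper. It builds every subclass from a characteristic function into a two-element class, mirroring the use of a two-element class in the proof of Thm.XI.2.70.

```python
from itertools import product

def subclasses(alpha):
    """All subclasses of alpha, one per characteristic function alpha -> {False, True}."""
    alpha = list(alpha)
    return [frozenset(x for x, keep in zip(alpha, bits) if keep)
            for bits in product([False, True], repeat=len(alpha))]

for size in range(6):
    sc = subclasses(range(size))
    assert len(set(sc)) == 2 ** size   # the number of subclasses is 2 to the power m
    assert size < len(sc)              # m < 2^m
```

The check covers only sizes 0 through 5; the theorem itself is what guarantees the inequality in general.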

EXERCISES 
XI.2.1.     Prove: 

(a)  1-AV(<.)  =  V-0. 

(b)  h(NC1<.fNC)  ePord. 

XI.2.2. Prove $\vdash \mathrm{Nc}(1) < \mathrm{Nc}(V)$.

XI.2.3.     Provel-(a,/3,T):.«e/3.7  5^A:  D  :(E5).(a  At)  sm  5.5  C  (^/V^). 
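XI.2.4. Prove $\vdash (\alpha,\beta): \alpha \cap \beta = \Lambda .\supset. \mathrm{Nc}(\mathrm{USC}(\alpha)) + \mathrm{Nc}(\mathrm{USC}(\beta)) = \mathrm{Nc}(\mathrm{USC}(\alpha \cup \beta))$.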
XI.2.5.     Prove: 

(a)  h  2  X  2  =  4. 

(b)  h  2^  =  4. 

(c)  h  2'  =  4l 

XI.2.6.     Prove: 

(a) $\vdash \mathrm{Nc}(V) + \mathrm{Nc}(V) = \mathrm{Nc}(V)$.

(b) $\vdash 2 \times \mathrm{Nc}(V) = \mathrm{Nc}(V)$.

XI.2.7.     Prove: 

(a)  f-  (m):m  =  Nc(V).  D  .rn    =  A. 

(b)  h  lm):m  =  Nc(V).  D  .m^  9^  m  X  m. 

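XI.2.8. Prove $\vdash (\alpha,\beta): \mathrm{Nc}(\alpha \cup \beta) = \mathrm{Nc}(\alpha) + \mathrm{Nc}(\beta - \alpha)$.

XI.2.9. Prove $\vdash (m): m \neq 0 . m \in \mathrm{NC} .\supset. m \times \mathrm{Nc}(V) = \mathrm{Nc}(V)$.

XI.2.10. Prove $\vdash (m,\alpha,x): m \in \mathrm{NC} . \alpha \cup \{x\} \in m + 1 . \sim x \in \alpha .\supset. \alpha \in m$.

XI.2.11. Prove $\vdash (\alpha,\beta). \mathrm{Nc}(\alpha \cup \beta) + \mathrm{Nc}(\alpha \cap \beta) = \mathrm{Nc}(\alpha) + \mathrm{Nc}(\beta)$.

XI.2.12. Prove $\vdash (R): R \in \mathrm{Funct} . \mathrm{Nc}(\mathrm{Val}(R)) < \mathrm{Nc}(\mathrm{Arg}(R)) .\supset. (Ex,y). x,y \in \mathrm{Arg}(R) . x \neq y . R(x) = R(y)$.
(Hint. If the conclusion is false, then $R \in 1\text{-}1$ and $\mathrm{Arg}(R) \;\mathrm{sm}\; \mathrm{Val}(R)$.)
This is the basis of the pigeonhole principle.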
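
For a finite function, the content of Ex.XI.2.12 can be illustrated by a short search. This is only a sketch in Python, assuming the function R is given as a dictionary from arguments to values; `find_collision` is a hypothetical name.

```python
def find_collision(R):
    """Return (x, y) with x != y and R[x] == R[y], or None if R is one-to-one."""
    seen = {}                      # value -> first argument that produced it
    for x, value in R.items():
        if value in seen:
            return seen[value], x  # two distinct arguments share a value
        seen[value] = x
    return None

# Three arguments but only two values: a collision must exist.
R = {1: "a", 2: "b", 3: "a"}
print(find_collision(R))
```

When there are fewer values than arguments the search cannot return `None`, which is the pigeonhole principle in its finite form.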

3. Finite Classes and Mathematical Induction. We take the finite
cardinals to be just the members of Nn. Then the infinite cardinals are
just the members of $\mathrm{NC} - \mathrm{Nn}$. In Sec. 1 of Chapter X, we have given a
number of properties of finite cardinals. Many more properties follow
from the theorems of Sec. 2 of the present chapter because of Thm.XI.2.11
which says

$\vdash \mathrm{Nn} \subseteq \mathrm{NC}$.

Theorem XI.3.1. $\vdash \sim \mathrm{Nc}(V) \in \mathrm{Nn}$.

Proof. Use Thms.XI.2.44 and XI.2.46.

Corollary. $\vdash \mathrm{NC} - \mathrm{Nn} \neq \Lambda$.

This theorem states that the universe is infinite. It would be the obvious
form to take for an "axiom of infinity." It is equivalent to Axiom scheme
13, so that in effect we were assuming the axiom of infinity when we
assumed Axiom scheme 13. In deducing the above theorem from Axiom
scheme 13, we have proved half of the equivalence between Axiom scheme
13 and the axiom of infinity. For hints as to how to prove the other half
of the equivalence, see Rosser, 1939.

Theorem XI.3.2. $\vdash (m,n,p): m \in \mathrm{Nn} . n,p \in \mathrm{NC} . m + n = m + p .\supset. n = p$.

Proof. Proof by induction on $m$ (Thm.X.1.13). Let $F(x)$ be

$(n,p): n,p \in \mathrm{NC} . x + n = x + p .\supset. n = p$.

Then $\vdash F(0)$ by Thm.X.1.8. Assume $m \in \mathrm{Nn}$, $F(m)$, and
$n,p \in \mathrm{NC} . (m + 1) + n = (m + 1) + p$. Then $(m + n) + 1 = (m + p) + 1$ by Thms.
X.1.9 and X.1.11, corollary. By Thm.XI.2.12, $m + n = m + p$. So
$n = p$ by $F(m)$. So we have proved $\vdash m \in \mathrm{Nn} . F(m) .\supset. F(m + 1)$.

Corollary 1. $\vdash (m,n): m \in \mathrm{Nn} . n \in \mathrm{NC} . m + n = m .\supset. n = 0$.

Corollary 2. $\vdash (m,n): m \in \mathrm{Nn} . n \in \mathrm{NC} . n \neq 0 .\supset. m + n \neq m$.

Corollary 3. $\vdash (m): m \in \mathrm{Nn} .\supset. m \neq m + 1$.

Corollary 4. $\vdash (m,n,p): m \in \mathrm{Nn} . n,p \in \mathrm{NC} . m + n \leq m + p .\supset. n \leq p$.

Proof. Use Thm.XI.2.22.

Corollary 5. $\vdash (m,n,p): m \in \mathrm{Nn} . n,p \in \mathrm{NC} . m + n < m + p .\supset. n < p$.

Corollary 6. $\vdash (m,n,p): m \in \mathrm{Nn} . n,p \in \mathrm{NC} . n < p .\supset. m + n < m + p$.

Proof. Use Thm.XI.2.23.

This theorem and its corollaries say in effect that, if one adds or subtracts
the same finite cardinal to or from both sides of an equation or an inequality,
the result is again an equation or an inequality with the same sense. This
is not necessarily true for infinite cardinals. For instance, let $m = \mathrm{Nc}(V)$,
$n = \mathrm{Nc}(V)$, and $p = 0$. Then $m + p = m$. Also, by Ex.XI.2.6, Part (a),
$m + n = m$. So $m + n = m + p$, but $n \neq p$.

Theorem XI.3.3. $\vdash (m,n): m \in \mathrm{Nn} . n \in \mathrm{NC} . n \leq m .\supset. n \in \mathrm{Nn}$.

Proof. Proof by induction on $m$. Let $F(x)$ be

$(n): n \in \mathrm{NC} . n \leq x .\supset. n \in \mathrm{Nn}$.

First assume $n \in \mathrm{NC} . n \leq 0$. Then by Thm.XI.2.15, Cor. 2, and
Thm.XI.2.20, $n = 0$. So $n \in \mathrm{Nn}$. Hence

(1) $\vdash F(0)$.

Now assume $m \in \mathrm{Nn}$, $F(m)$, and $n \in \mathrm{NC} . n \leq m + 1$. By Thm.XI.2.22,

$p \in \mathrm{NC} . m + 1 = n + p$.

Also by Thm.XI.2.24, corollary,

$p = 0 .\vee. (Eq). q \in \mathrm{NC} . p = q + 1$.

Case 1. $p = 0$. Then $n = m + 1$, and so $n \in \mathrm{Nn}$.

Case 2. $(Eq). q \in \mathrm{NC} . p = q + 1$. Then by rule C, $p = q + 1$,
$m + 1 = n + q + 1$. By Thm.XI.2.12, $m = n + q$. Hence $n \leq m$, and by the
hypothesis $F(m)$, we conclude $n \in \mathrm{Nn}$.

In words, this theorem says that, if $m$ is a finite cardinal, then any smaller
cardinal is also finite. This is one of the important properties of finite
cardinals in intuitive mathematics and so is one of the results which must
be proved in the symbolic logic if we wish to justify our decision that Nn
should be taken as the class of finite cardinals.

Theorem XI.3.4. $\vdash (m,n):. m,n \in \mathrm{Nn} :\supset: n \leq m .\equiv. (Ep). p \in \mathrm{Nn} . m = n + p$.

Proof. Assume $m,n \in \mathrm{Nn}$. Then by Thm.XI.2.22, we go from right to
left easily. Conversely, assume $n \leq m$. Then by Thm.XI.2.22 and rule C,
$p \in \mathrm{NC} . m = n + p$. By Thm.XI.2.22, $p \leq m$. By Thm.XI.3.3, $p \in \mathrm{Nn}$.

Theorem XI.3.5. $\vdash (m,n):. m,n \in \mathrm{Nn} :\supset: n < m .\equiv. (Ep). p \in \mathrm{Nn} . m = n + p + 1$.

Proof. Assume $m,n \in \mathrm{Nn}$ and $n < m$. Then

(1) $m \neq n$,

and by Thm.XI.3.4 and rule C,

(2) $p \in \mathrm{Nn} . m = n + p$.

Then by (1), $p \neq 0$. So by Thm.X.1.7, $q \in \mathrm{Nn} . p = q + 1$. So
$q \in \mathrm{Nn} . m = n + q + 1$.

Conversely, assume $m,n \in \mathrm{Nn}$ and $(Ep). p \in \mathrm{Nn} . m = n + p + 1$. Then
by rule C,

(3) $p \in \mathrm{Nn} . m = n + p + 1$,

and by Thm.XI.3.4,

(4) $n \leq m$.

By Thm.XI.3.2, Cor. 2, and Thm.X.1.2, $n \neq n + p + 1$, and so by (3),
$m \neq n$.

Corollary 1. $\vdash (m,n):. m,n \in \mathrm{Nn} :\supset: n < m .\equiv. n + 1 \leq m$.

Corollary 2. $\vdash (m): m \in \mathrm{Nn} .\supset. m < m + 1$.

Corollary 3. $\vdash (m,n):. m,n \in \mathrm{Nn} :\supset: n < m + 1 .\equiv. n \leq m$.

Proof. By Cor. 1,

$n < m + 1 .\equiv. n + 1 \leq m + 1$,

by Thm.XI.3.2, Cor. 4,

$n + 1 \leq m + 1 .\supset. n \leq m$,

and by Thm.XI.2.23,

$n \leq m .\supset. n + 1 \leq m + 1$.

Theorem XI.3.6. $\vdash (m,n):. m \in \mathrm{Nn} . n \in \mathrm{NC} :\supset: m < n .\vee. m = n .\vee. m > n$.

Proof. Proof by induction on $m$. Let $F(x)$ be

$(n):. n \in \mathrm{NC} :\supset: x < n .\vee. x = n .\vee. x > n$.

By Thm.XI.2.15, Cor. 2, and Thm.XI.2.14, Cor. 2,

(1) $\vdash F(0)$.

Assume $F(m)$, $m \in \mathrm{Nn}$, and $n \in \mathrm{NC}$. Then by $F(m)$,

(2) $m < n .\vee. m = n .\vee. m > n$.

Case 1. Assume $m < n$. Then

(3) $m \neq n$

and by Thm.XI.2.22,

(4) $p \in \mathrm{NC} . n = m + p$.

By (3) and (4), $p \neq 0$. By Thm.XI.2.24, corollary, $q \in \mathrm{NC} . p = q + 1$.
By (4), $n = m + q + 1 = (m + 1) + q$. So $m + 1 \leq n$, and by
Thm.XI.2.14, Cor. 2,

$m + 1 < n .\vee. m + 1 = n$.

Case 2. Assume $m = n .\vee. m > n$. By Thm.XI.2.14, Cor. 2, $n \leq m$.
Also by Thm.XI.3.5, Cor. 2, $m < m + 1$. So by Thm.XI.2.25, Cor. 3,
$n < m + 1$. That is, $m + 1 > n$.

By truth values,

$\vdash P \supset R . Q \supset S .\supset. P \vee Q \supset R \vee S$.

So, from (2) we get

$m + 1 < n .\vee. m + 1 = n .\vee. m + 1 > n$.

Corollary 1. $\vdash (m,n): m \in \mathrm{Nn} . n \in \mathrm{NC} . \sim(m < n) .\supset. n \leq m$.

Corollary 2. $\vdash (m,n): m \in \mathrm{Nn} . n \in \mathrm{NC} . \sim(m < n) .\supset. n \in \mathrm{Nn}$.

Proof. Use Thm.XI.3.3.

Corollary 3. $\vdash (m,n): m \in \mathrm{Nn} . n \in \mathrm{NC} - \mathrm{Nn} .\supset. m < n$.

Proof. $\vdash n \in \mathrm{NC} - \mathrm{Nn} .\equiv. n \in \mathrm{NC} . \sim n \in \mathrm{Nn}$.

This last corollary tells us that each finite cardinal is less than each
infinite cardinal, which is a familiar intuitive property.

**Theorem XI.3.7. $\vdash (m,n):. m,n \in \mathrm{Nn} :\supset: m < n .\vee. m = n .\vee. m > n$.

Proof.     Use  Thm.XI.3.6. 

Corollary  1.     \-  {m,n):.m,n  e  Nn:  D  :'~(m  <  n).  =  .n  <  m. 

Proof.     Combine  the  theorem  with  Thm.XI.2.20,  Cor.  2. 

Corollary  2.     |-  (m,n):.m,n  e  Nn:  D  :'^(m  <  n).  =  .n  <  m. 

This  theorem  tells  us  that  any  two  finite  cardinals  are  comparable. 
Also,  the  two  corollaries  are  familiar  and  widely  used  properties  of  finite 
cardinals. 

Theorem XI.3.8. $\vdash (m,n): m,n \in \mathrm{Nn} .\supset. m \times n \in \mathrm{Nn}$.

Proof. Proof by induction on $n$. Let $F(n)$ be

$(m): m \in \mathrm{Nn} .\supset. m \times n \in \mathrm{Nn}$.

By Thm.XI.2.32, Cor. 1,

$\vdash F(0)$.

Assume $n \in \mathrm{Nn}$, $F(n)$, and $m \in \mathrm{Nn}$. Then $m \times n \in \mathrm{Nn}$, and so by
Thm.X.1.14, $(m \times n) + m \in \mathrm{Nn}$. However,
$(m \times n) + m = (m \times n) + (m \times 1) = m \times (n + 1)$. So $m \times (n + 1) \in \mathrm{Nn}$. Thus

$\vdash n \in \mathrm{Nn} . F(n) .\supset. F(n + 1)$.

This theorem tells us that the product of any two finite cardinals is a
finite cardinal. This theorem has a partial converse, which we now state.

Theorem XI.3.9. $\vdash (m,n): m,n \in \mathrm{NC} . n \neq 0 . m \times n \in \mathrm{Nn} .\supset. m \in \mathrm{Nn}$.

Proof. Assume the hypothesis. Then by Thm.XI.2.36, $m \leq m \times n$.
So by Thm.XI.3.3, $m \in \mathrm{Nn}$.

Theorem XI.3.10. $\vdash (m,n,p): m,n,p \in \mathrm{Nn} . m \neq 0 . n < p .\supset. m \times n < m \times p$.

Proof. Assume the hypothesis. Then by Thm.X.1.7,

$q \in \mathrm{Nn} . m = q + 1$.

Also by Thm.XI.3.5,

$r \in \mathrm{Nn} . p = n + r + 1$.

So

$m \times p = m \times (n + r + 1)$
$= (m \times n) + (m \times r) + m$
$= (m \times n) + (m \times r) + (q + 1)$
$= (m \times n) + ((m \times r) + q) + 1$.

So, by Thm.XI.3.5, $m \times n < m \times p$.

Theorem XI.3.11.

I. $\vdash (m,n,p):. m,n,p \in \mathrm{Nn} . m \neq 0 :\supset: n = p .\equiv. m \times n = m \times p$.

II. $\vdash (m,n,p):. m,n,p \in \mathrm{Nn} . m \neq 0 :\supset: n < p .\equiv. m \times n < m \times p$.

III. $\vdash (m,n,p):. m,n,p \in \mathrm{Nn} . m \neq 0 :\supset: n \leq p .\equiv. m \times n \leq m \times p$.

Proof of Part I. Assume $m,n,p \in \mathrm{Nn} . m \neq 0$ and $m \times n = m \times p$. By
Thm.XI.3.7, $n < p .\vee. n = p .\vee. n > p$. If $n < p$, then $m \times n < m \times p$ by
Thm.XI.3.10. Also, if $n > p$, then $m \times n > m \times p$ by Thm.XI.3.10. So
$n = p$.

Proof of Part II. Similar.

Proof of Part III. Combine Parts I and II by Thm.XI.2.14, Cor. 2.

This theorem tells us that division is possible if the divisor is not zero,
and that division of both sides of an equality or an inequality yields an
equality or an inequality with the same sense.
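
A quick sanity check of the three cancellation laws on ordinary nonnegative integers is sketched below in Python. It assumes only that the finite cardinals behave like Python's built-in integers on the small range tested, and it is a spot check rather than a proof.

```python
N = range(0, 8)

for m in N:
    for n in N:
        for p in N:
            if m != 0:
                assert (n == p) == (m * n == m * p)    # Part I
                assert (n < p) == (m * n < m * p)      # Part II
                assert (n <= p) == (m * n <= m * p)    # Part III

# The hypothesis m != 0 is needed: 0 * 1 == 0 * 2 although 1 != 2.
counterexample = (0 * 1 == 0 * 2) and (1 != 2)
```

The final line records why the condition that the multiplier be nonzero cannot be dropped.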

We now prove that all the usual laws of exponents hold for finite cardinals
without exception, and also that, if $m$ and $n$ are finite cardinals, then so is $n^m$.

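Theorem XI.3.12. $\vdash (m): m \in \mathrm{Nn} .\supset. m, m^0 \in \mathrm{NC}$.

Proof. Assume $m \in \mathrm{Nn}$. Then $m \in \mathrm{NC}$. Also, by Thm.XI.2.44, $m^0 \neq \Lambda$.
Then by Thm.XI.2.48, Cor. 1, $m^0 \in \mathrm{NC}$.

Corollary 1. $\vdash (m): m \in \mathrm{Nn} .\supset. m^0 = 1$.

Corollary 2. $\vdash (m): m \in \mathrm{Nn} . m \neq 0 .\supset. 0^m = 0$.

Corollary 3. $\vdash (m): m \in \mathrm{Nn} .\supset. m^1 = m$.

Corollary 4. $\vdash (m): m \in \mathrm{Nn} .\supset. 1^m = 1$.

Corollary 5. $\vdash (m,n,p): m,n,p \in \mathrm{Nn} .\supset. m^{n+p} = m^n \times m^p$.

Corollary 6. $\vdash (m): m \in \mathrm{Nn} .\supset. m^2 = m \times m$.

Corollary 7. $\vdash (m,n,p): m,n,p \in \mathrm{Nn} . m \leq n .\supset. m^p \leq n^p$.

Corollary 8. $\vdash (m,n,p): m,n,p \in \mathrm{Nn} . n \leq p . m \neq 0 .\supset. m^n \leq m^p$.

Corollary 9. $\vdash (m): m \in \mathrm{Nn} .\supset. m < 2^m$.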

Theorem XI.3.13. $\vdash (m,n): m,n \in \mathrm{Nn} .\supset. n^m \in \mathrm{Nn}$.

Proof. Proof by induction on $m$. Let $F(x)$ be

$(n): n \in \mathrm{Nn} .\supset. n^x \in \mathrm{Nn}$.

By Thm.XI.3.12, Cor. 1,

$\vdash F(0)$.

Assume $m \in \mathrm{Nn}$, $F(m)$, and $n \in \mathrm{Nn}$. Then $n^m \in \mathrm{Nn}$. So by Thm.XI.3.8,
$n^m \times n \in \mathrm{Nn}$. However $n = n^1$ by Thm.XI.3.12, Cor. 3. So by
Thm.XI.3.12, Cor. 5,

$n^m \times n = n^m \times n^1 = n^{m+1}$.

So $n^{m+1} \in \mathrm{Nn}$. Thus

$\vdash m \in \mathrm{Nn} . F(m) .\supset. F(m + 1)$.

Corollary. $\vdash (m,n,p): m,n,p \in \mathrm{Nn} .\supset. (m^n)^p = m^{n \times p}$.

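Theorem XI.3.14. $\vdash (m,n,p): m,n,p \in \mathrm{Nn} . m < n . p \neq 0 .\supset. m^p < n^p$.

Proof. Assume $m,n,p \in \mathrm{Nn} . m < n . p \neq 0$. We prove by induction on $q$
the statement

(1) $(q): q \in \mathrm{Nn} .\supset. m^{q+1} < n^{q+1}$.

Let $F(x)$ be

$m^{x+1} < n^{x+1}$.

Now $0 + 1 = 1$, so that $m^{0+1} = m^1 = m$ and $n^{0+1} = n^1 = n$. So
$m^{0+1} < n^{0+1}$. That is,

$F(0)$.

Now assume $q \in \mathrm{Nn}$ and $F(q)$. Then $m^{q+1} < n^{q+1}$. However,
$m^{(q+1)+1} = m^{q+1} \times m$ and $n^{(q+1)+1} = n^{q+1} \times n$. By Thm.XI.3.5,

$r \in \mathrm{Nn} . n^{q+1} = m^{q+1} + r + 1$,
$s \in \mathrm{Nn} . n = m + s + 1$.

So

$n^{(q+1)+1} = n^{q+1} \times n$
$= (m^{q+1} + r + 1) \times n$
$= (m^{q+1} \times n) + (r \times n) + n$
$= (m^{q+1} \times (m + s + 1)) + (r \times n) + m + s + 1$
$= (m^{q+1} \times m) + (m^{q+1} \times (s + 1)) + (r \times n) + m + s + 1$
$= m^{(q+1)+1} + ((m^{q+1} \times (s + 1)) + (r \times n) + m + s) + 1$.

So by Thm.XI.3.5, $m^{(q+1)+1} < n^{(q+1)+1}$. Thus

$q \in \mathrm{Nn} . F(q) .\supset. F(q + 1)$.

So (1) is established. Now by $p \neq 0$ and Thm.X.1.7, $q \in \mathrm{Nn} . p = q + 1$.
So by (1), $m^p < n^p$.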

Theorem XI.3.15.

I. $\vdash (m,n,p):. m,n,p \in \mathrm{Nn} . p \neq 0 :\supset: m = n .\equiv. m^p = n^p$.

II. $\vdash (m,n,p):. m,n,p \in \mathrm{Nn} . p \neq 0 :\supset: m < n .\equiv. m^p < n^p$.

III. $\vdash (m,n,p):. m,n,p \in \mathrm{Nn} . p \neq 0 :\supset: m \leq n .\equiv. m^p \leq n^p$.

Proof. Proof similar to that of Thm.XI.3.11.

Theorem XI.3.16. $\vdash (m,n,p): m,n,p \in \mathrm{Nn} . 2 \leq m . n < p .\supset. m^n < m^p$.

Proof. Assume the hypothesis. By Thm.XI.3.5, $q \in \mathrm{Nn} . p = n + q + 1$.
Then $m^p = m^n \times m^{q+1}$. Now $\vdash 1 < 2$, so that $1 < m$. Hence, by
Thm.XI.3.14, $1^{q+1} < m^{q+1}$. So $1 < m^{q+1}$. Also, by Thm.XI.2.58, $m^n \neq 0$.
Then, by Thm.XI.3.10, $m^n \times 1 < m^n \times m^{q+1}$. So $m^n < m^p$.

Theorem XI.3.17.

I. $\vdash (m,n,p):. m,n,p \in \mathrm{Nn} . 2 \leq m :\supset: n = p .\equiv. m^n = m^p$.

II. $\vdash (m,n,p):. m,n,p \in \mathrm{Nn} . 2 \leq m :\supset: n < p .\equiv. m^n < m^p$.

III. $\vdash (m,n,p):. m,n,p \in \mathrm{Nn} . 2 \leq m :\supset: n \leq p .\equiv. m^n \leq m^p$.

Proof. Similar to that of Thm.XI.3.11.

We have so far used induction only in the form

$F(0)$, $(n): n \in \mathrm{Nn} . F(n) .\supset. F(n + 1) \vdash (n): n \in \mathrm{Nn} .\supset. F(n)$.

We shall call this the principle of weak induction to distinguish it from a
stronger form which we shall soon derive. We also say that it starts at
zero. A form which starts at unity is in common use. Let temporarily

$\mathrm{PI} = \mathrm{Nn} - \{0\}$,

so that PI is the class of all positive integers. Then the principle of weak
induction starting at unity can be expressed as

$F(1)$, $(n): n \in \mathrm{PI} . F(n) .\supset. F(n + 1) \vdash (n): n \in \mathrm{PI} .\supset. F(n)$.

This is the form which is used to prove by induction such results as:
If $n$ is a positive integer, then

$1^3 + 2^3 + \cdots + n^3 = \dfrac{n^2(n+1)^2}{4}$.
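
The standard closed form for the sum of the first n cubes, n²(n+1)²/4, can be spot-checked numerically before attempting the inductive proof. The following Python sketch (the function name `sum_of_cubes` is hypothetical) also checks the inductive step in the form that a proof starting at unity would use.

```python
def sum_of_cubes(n):
    """Sum 1^3 + 2^3 + ... + n^3 computed term by term."""
    return sum(k ** 3 for k in range(1, n + 1))

def closed_form(n):
    """The closed form n^2 (n+1)^2 / 4, which is always an integer."""
    return n * n * (n + 1) * (n + 1) // 4

# Base case F(1) and a range of further cases.
assert sum_of_cubes(1) == closed_form(1) == 1
for n in range(1, 200):
    assert sum_of_cubes(n) == closed_form(n)
    # Inductive step: adding (n+1)^3 to the closed form for n gives the closed form for n+1.
    assert closed_form(n) + (n + 1) ** 3 == closed_form(n + 1)
```

A numerical check of finitely many cases is of course no substitute for the induction itself.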

Actually, one can start a proof by induction at any integer. This is
expressed in the following theorem, which is the general form of the
principle of weak induction.

Theorem XI.3.18. Let $m$ be a variable distinct from $n$, and let $F(x)$ be a
stratified statement. Then $m \in \mathrm{Nn}$, $F(m)$, $(n): n \in \mathrm{Nn} . m \leq n . F(n) .\supset. F(n + 1) \vdash (n): n \in \mathrm{Nn} . m \leq n .\supset. F(n)$.

Proof. Define $G(x)$ to be $F(m + x)$. Then

$\vdash F(m) \equiv G(0)$,

$m \in \mathrm{Nn} \vdash (n): n \in \mathrm{Nn} . m \leq n . F(n) .\supset. F(n + 1) .:\equiv:. (x): x \in \mathrm{Nn} . G(x) .\supset. G(x + 1)$,

$m \in \mathrm{Nn} \vdash (n): n \in \mathrm{Nn} . m \leq n .\supset. F(n) .:\equiv:. (x): x \in \mathrm{Nn} .\supset. G(x)$.

From these, our theorem follows readily by Thm.X.1.13.

As an illustration of the use of this for $m > 1$, we will prove:

If $n$ is an integer $\geq 5$, then

$$\frac{(2n)!}{(n!)^2} < 4^{n-1}.$$

Proof by induction on $n$. First, let $n = 5$. Then

$$\frac{(2n)!}{(n!)^2} = \frac{6 \cdot 7 \cdot 8 \cdot 9 \cdot 10}{1 \cdot 2 \cdot 3 \cdot 4 \cdot 5} = 7 \cdot 2 \cdot 9 \cdot 2 = 4 \cdot 63 < 4 \cdot 64 = 4^4 = 4^{n-1}.$$

Now assume $n \geq 5$ and $(2n)!/(n!)^2 < 4^{n-1}$. Then

$$\frac{(2(n+1))!}{((n+1)!)^2} = \frac{(2n)!\,(2n+1)(2n+2)}{(n!)^2 (n+1)^2} < \frac{(2n)!}{(n!)^2} \cdot \frac{(2n+2)(2n+2)}{(n+1)^2} < 4^{n-1} \cdot 4 = 4^{(n+1)-1}.$$
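
The inequality (2n)!/(n!)² < 4^(n-1) and the reason the induction starts at 5 can both be checked numerically. The Python sketch below is only an illustration; `central_ratio` is a hypothetical name for the quantity (2n)!/(n!)².

```python
from math import factorial

def central_ratio(n):
    """The exact integer (2n)! / (n!)^2."""
    return factorial(2 * n) // factorial(n) ** 2

# Base case n = 5: the ratio is 252, and 252 < 256 = 4^4.
assert central_ratio(5) == 252 and 252 < 4 ** 4

for n in range(5, 200):
    assert central_ratio(n) < 4 ** (n - 1)
    # Step used in the proof: (2n+1)(2n+2) < (2n+2)(2n+2) = 4 (n+1)^2.
    assert (2 * n + 1) * (2 * n + 2) < 4 * (n + 1) ** 2

# The inequality fails for n = 1, 2, 3, 4, so the induction cannot start earlier.
failures = [n for n in range(1, 5) if not central_ratio(n) < 4 ** (n - 1)]
```

For example, at n = 4 the ratio is 70 while 4³ = 64, which is why the base case is taken at n = 5.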

In some of the theorems which will follow, we shall have rather
complicated hypotheses and shall wish to write the theorems in a form which
displays the hypotheses so that they are easily grasped by the eye. To do
this, we shall often write the various statements of the hypothesis and
conclusion on separate lines, separating them by the word "yield" to
indicate ⊢. Often we may number some of the statements. Thus we might
write Thm.XI.3.18 above in the alternative form

Theorem XI.3.18. Let m be a variable distinct from n, and let F(x) be a
stratified statement. Then

(1) m ε Nn,

(2) F(m),

(3) (n):n ε Nn.m ≤ n.F(n). ⊃ .F(n + 1)

yield

(n):n ε Nn.m ≤ n. ⊃ .F(n).

In order to avoid having to insert trivial hypotheses such as the
hypothesis that m and n are distinct, we shall agree that henceforth any
variables appearing in theorems as distinct letters will be understood to
be distinct.

We now consider strong induction. For simplicity, we first consider only
the case in which the induction starts from zero. In weak induction, one
assumes only n ε Nn and F(n) and uses them to derive F(n + 1). In strong
induction, one assumes each of n ε Nn, F(0), F(1), . . . , F(n), and uses them
to derive F(n + 1). With more assumptions, it is clearly easier to derive
F(n + 1). Hence one can carry out proofs by strong induction in cases
where weak induction would be inadequate.

We  often  hear  the  principle  of  strong  induction  (starting  with  zero) 
expressed  in  words  as  follows. 

If F(0), and if F(n + 1) follows whenever we have F(x) for all x ≤ n,
then F(n) for all n.

In symbols, the principle appears as follows.

(1) F(0)

(2) (n)::n ε Nn:.(x):x ε Nn.x ≤ n. ⊃ .F(x).: ⊃ :.F(n + 1)

yield

(n):n ε Nn. ⊃ .F(n).

We  can  generalize  strong  induction  to  start  at  any  integer,  and  we  prove 
the  principle  in  this  form. 

Theorem XI.3.19. Let F(x) be a stratified statement. Then

(1) m ε Nn,

(2) F(m),

(3) (n)::n ε Nn.m ≤ n:.(x):x ε Nn.m ≤ x.x ≤ n. ⊃ .F(x).: ⊃ :.F(n + 1),

yield

(n):n ε Nn.m ≤ n. ⊃ .F(n).

Proof. Assume the hypotheses, and in addition

(4) n ε Nn.m ≤ n.

Define G(n) to be

(x):x ε Nn.m ≤ x.x ≤ n. ⊃ .F(x).

Clearly, by (1), (2), and Thm.XI.2.20,

(5) G(m).

Lemma. (p):p ε Nn.m ≤ p.G(p). ⊃ .G(p + 1).

Proof. Assume

(i) p ε Nn.m ≤ p,

(ii) G(p),

(iii) x ε Nn.m ≤ x.x ≤ p + 1.

By (iii) and Thm.XI.2.14, Cor. 2, x < p + 1.v.x = p + 1.

Case 1. x < p + 1. Then by Thm.XI.3.5, Cor. 3, x ≤ p. So by
(iii), (ii), and the definition of G(p), we get F(x).

Case 2. x = p + 1. Put (i) and (ii) into (3), and get F(p + 1). Then
F(x).

So in either case, we get F(x). Then, by the definition of G(p + 1),
our lemma follows.

By using Thm.XI.3.18 with (1), (5), and our lemma, we infer

(n):n ε Nn.m ≤ n. ⊃ .G(n).

So by (4), G(n). Then by taking x to be n in G(n) and using (4) and
Thm.XI.2.14, Cor. 1, we infer F(n).

Corollary. Let F(x) be a stratified statement. Then

(1) F(0)

(2) (n)::n ε Nn:.(x):x ε Nn.x ≤ n. ⊃ .F(x).: ⊃ :.F(n + 1)

yield

(n):n ε Nn. ⊃ .F(n).

Proof.     Take  m  =  0. 

The  principle  of  strong  induction  is  sometimes  stated  in  the  alternative 
form: 

Theorem XI.3.20. Let F(x) be a stratified statement. Then

(1) m ε Nn,

(2) (n)::n ε Nn.m ≤ n:.(x):x ε Nn.m ≤ x.x < n. ⊃ .F(x).: ⊃ :.F(n)

yield

(n):n ε Nn.m ≤ n. ⊃ .F(n).

Proof. Assume the hypotheses.

Lemma 1. (x):x ε Nn.m ≤ x.x < m. ⊃ .F(x).

Proof. Assume x ε Nn.m ≤ x.x < m. Then by (1) and Thm.XI.2.20,
Cor. 1, (m ≤ x)∼(m ≤ x). However, by truth values, ⊢ P∼P. ⊃ .Q.
So F(x).

Lemma 2. (n)::n ε Nn.m ≤ n:.(x):x ε Nn.m ≤ x.x ≤ n. ⊃ .F(x).: ⊃ :.
F(n + 1).

Proof. Assume

(i) n ε Nn.m ≤ n

(ii) (x):x ε Nn.m ≤ x.x ≤ n. ⊃ .F(x).

By (i),

n + 1 ε Nn.m ≤ n + 1,

and by (ii) and Thm.XI.3.5, Cor. 3,

(x):x ε Nn.m ≤ x.x < n + 1. ⊃ .F(x).

Hence by taking n to be n + 1 in (2), we get F(n + 1).

If we take n to be m in (2), and use (1), Thm.XI.2.14, Cor. 1, and Lemma
1, we get F(m). Then by (1), Lemma 2, and Thm.XI.3.19, we deduce our
theorem.

The form of the principle embodied in Thm.XI.3.20 has one less
hypothesis than the form embodied in Thm.XI.3.19. However, in practice, the
second hypothesis of Thm.XI.3.20 is usually just as difficult to prove as the
second and third hypotheses of Thm.XI.3.19. Accordingly, Thm.XI.3.20
is very little used, and we include it mainly as a curiosity.

As an illustration of the use of strong induction, we cite the proofs of
Thms.IV.5.2, VI.5.4, VI.6.3, VI.7.1, IX.2.4, IX.2.6, and others. In the
proof of Thm.VI.7.2, Lemma A is proved by strong induction and Lemma
D by weak induction. We now prove by strong induction a theorem whose
proof by weak induction would be difficult.

**Theorem XI.3.21. ⊢ (α):.α ⊆ Nn.α ≠ Λ: ⊃ :(En):n ε α:(m).m ε α ⊃
n ≤ m.

Proof. Let F(x) denote

x ε α: ⊃ :(En):n ε α:(m).m ε α ⊃ n ≤ m.

We shall prove by strong induction (Thm.XI.3.19, corollary) that

α ⊆ Nn ⊢ (x):x ε Nn. ⊃ .F(x).

Let us assume

(1) α ⊆ Nn.

By (1) and Thm.XI.2.15, Cor. 2, (m).m ε α ⊃ 0 ≤ m. So

0 ε α: ⊃ :0 ε α:(m).m ε α ⊃ 0 ≤ m.

Then

0 ε α: ⊃ :(En):n ε α:(m).m ε α ⊃ n ≤ m.

That is,

(2) F(0).

Lemma. (n)::n ε Nn:.(x):x ε Nn.x ≤ n. ⊃ .F(x).: ⊃ :.F(n + 1).

Proof. Assume

(i) n ε Nn,

(ii) (x):x ε Nn.x ≤ n. ⊃ .F(x),

(iii) n + 1 ε α.



Case 1. (m).m ε α ⊃ n + 1 ≤ m. Then, by (iii),

(En):n ε α:(m).m ε α ⊃ n ≤ m.

Case 2. ∼(m).m ε α ⊃ n + 1 ≤ m. Then by duality and rule C,
m ε α.∼(n + 1 ≤ m). Then m ε Nn by (1), and so by (i) and Thm.XI.3.7,
Cor. 2, m < n + 1. By Thm.XI.3.5, Cor. 3, m ≤ n. Then F(m) by (ii),
and so, since m ε α,

(En):n ε α:(m).m ε α ⊃ n ≤ m.

Hence this result holds in both cases, and our lemma is proved.
By (2) and our lemma and Thm.XI.3.19, corollary,

(3) α ⊆ Nn ⊢ (x):x ε Nn. ⊃ .F(x).

We now prove our theorem. Assume α ⊆ Nn and α ≠ Λ. Then x ε α,
and so x ε Nn. Then by (3), (En):n ε α:(m).m ε α ⊃ n ≤ m.

Corollary. ⊢ (α,z)::α ⊆ Nn.α ≠ Λ.z = ιx (x ε α:(y).y ε α ⊃ x ≤ y).: ⊃ :.
z ε α:(y).y ε α ⊃ z ≤ y.

Proof. Assume

(1) α ⊆ Nn

and α ≠ Λ, and write F(x) for x ε α:(y).y ε α ⊃ x ≤ y. Then by our theorem,

(Ex).F(x).

Also by (1) and Thm.XI.2.20

(x,z):F(x).F(z). ⊃ .x = z.

Then by Thm.VII.2.1,

(E₁x).F(x).

Then by Thm.VIII.2.2,

F(ιx F(x)).

If we now assume

z = ιx (x ε α:(y).y ε α ⊃ x ≤ y),

we are assuming z = ιx F(x), and we infer F(z), which is our theorem.

In words, our theorem states that every nonempty set of nonnegative
integers has a least member. The corollary states that in such case
ιx (x ε α:(y).y ε α ⊃ x ≤ y) is this least member.

Although we used the principle of strong induction to prove Thm.XI.3.21,
the latter is more flexible and of wider application than the principle of
strong induction. Indeed, any proof that could be carried out by use of
strong induction could be carried out almost as simply by use of
Thm.XI.3.21. We shall indicate the procedure, restricting attention to the
case in which the induction starts with zero. Let us have given a stratified
statement F(x) with the properties

(A) F(0),

(B) (n)::n ε Nn:.(x):x ε Nn.x ≤ n. ⊃ .F(x).: ⊃ :.F(n + 1).

Let us prove

(C) (n):n ε Nn. ⊃ .F(n)

by reductio ad absurdum. So we assume the dual of (C), and use rule C
to infer

(D) p ε Nn.∼F(p).

We now denote

p̂(p ε Nn.∼F(p))

by α. Then α ⊆ Nn, and by (D), α ≠ Λ. Hence by Thm.XI.3.21, there
is a least m in α.

Case 1. m = 0. Since m ε α, we have ∼F(m) by the definition of α,
and hence ∼F(0), which contradicts (A).

Case 2. m ≠ 0. Then there is an n such that n ε Nn.m = n + 1. Since
m is the least member of α, we have (x):x ε α. ⊃ .n < x. That is, (x):x ε Nn.
x ≤ n. ⊃ .∼ x ε α. However, by the definition of α, ∼ x ε α: ⊃ :x ε Nn.
⊃ .F(x). So (x):x ε Nn.x ≤ n. ⊃ .F(x). Then by (B), F(n + 1). That is,
F(m), contradicting m ε α.

Some mathematicians use Thm.XI.3.21 even in cases as indicated above
where strong induction would avoid reductio ad absurdum and simplify the
proof. However, there are many cases in which use of Thm.XI.3.21 yields
a simpler proof than would result from the use of strong induction. Such a
case would be the proof of the theorem that every integer greater than
unity has a prime factor. One can easily prove this by strong induction,
taking m = 2 in Thm.XI.3.19. For assume the theorem for all m with
2 ≤ m ≤ n. If n + 1 has no factor f with 1 < f < n + 1, then n + 1 is a
prime, and has itself as a prime factor. If n + 1 has a factor f with
1 < f < n + 1, then 2 ≤ f ≤ n. Then f has a prime factor p, which must
then divide n + 1. By use of Thm.XI.3.21, the proof is even quicker.
For let n ≥ 2. Then n is in the set of divisors of n which are greater than
unity. Let f be the least such. Then f has no divisor g with 1 < g < f,
else g would be a smaller divisor of n. Thus f is a prime, and so n has the
prime divisor f.
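The least-member argument translates directly into a short computation. The Python sketch below is an informal illustration (the function names are ours): it searches upward for the least divisor of n greater than unity, which exists because n itself is such a divisor, and the argument just given says that this least divisor must be a prime.

```python
def least_divisor_gt_one(n: int) -> int:
    """Return the least d > 1 dividing n, for an integer n >= 2.

    The set of divisors of n greater than 1 is nonempty (it contains n),
    so the upward search below is guaranteed to stop at or before d = n.
    """
    if n < 2:
        raise ValueError("n must be an integer greater than unity")
    d = 2
    while n % d != 0:
        d += 1
    return d


def is_prime(p: int) -> bool:
    """Naive primality test: p >= 2 and no divisor g with 1 < g < p."""
    return p >= 2 and all(p % g != 0 for g in range(2, p))


if __name__ == "__main__":
    for n in (2, 15, 49, 91, 97):
        f = least_divisor_gt_one(n)
        print(n, f, is_prime(f))
```

The search relies on exactly the fact stated in Thm.XI.3.21: a nonempty set of nonnegative integers has a least member, so no induction hypothesis is needed.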

In  this  connection,  it  is  instructive  to  compare  the  two  proofs  of  unique 
factorization  given  by  Hardy  and  Wright  on  pages  19  to  22.  The  first  uses 
Thm.XI.3.21  (see  the  proof  of  Theorem  23  on  page  20  of  Hardy  and  Wright) 
and the second uses strong induction (see Sec. 2.11 of Hardy and Wright).

Another useful means of proving properties of integers is by means of the
principle of infinite descent. This operates as follows. Let it be required
to prove (C). We prove

(E) (n):.n ε Nn.∼F(n): ⊃ :(Em).m < n.m ε Nn.∼F(m).

We then infer (C).

The original intuitive justification for this principle was to proceed by
reductio ad absurdum. Suppose ∼(n):n ε Nn. ⊃ .F(n). Then there is an
integer n for which ∼F(n). Then by (E), there is a smaller integer m for
which ∼F(m). Then by (E) again, there is a still smaller integer p for
which ∼F(p). Proceeding indefinitely in this fashion, we produce an
infinite succession of smaller and smaller integers, all nonnegative. This
cannot be.

It was this sort of intuitive justification which led to the name "infinite
descent." Often one shortens the phrase and refers to a proof which uses
the principle of infinite descent merely as a "proof by descent."

Proof  by  descent  is  widely  used  in  number  theory.  For  an  example,  see 
Hardy  and  Wright,  pages  190  to  193,  298,  and  others. 

By means of Thm.XI.3.21, one can readily justify the principle of
infinite descent. We do so in the following theorem.

Theorem XI.3.22. Let F(x) be a stratified statement. Then

(1) m ε Nn,

(2) (n):.n ε Nn.m ≤ n.∼F(n): ⊃ :(Ex).x < n.x ε Nn.m ≤ x.∼F(x),

yield

(n):n ε Nn.m ≤ n. ⊃ .F(n).

Proof. Assume the hypothesis. Define α to be

n̂(n ε Nn.m ≤ n.∼F(n)).

Then

(3) α ⊆ Nn,

and by (2),

(4) (n):n ε α. ⊃ .(Ex).x ε α.x < n.

We wish to prove the theorem by reductio ad absurdum, to which end
we assume

∼(n):n ε Nn.m ≤ n. ⊃ .F(n).

By duality, this gives (En).n ε α. Then α ≠ Λ. Hence by (3) and
Thm.XI.3.21, (En).n ε α.(m).m ε α ⊃ n ≤ m. By rule C and (4)

(m).m ε α ⊃ n ≤ m,
(Ex).x ε α.x < n.

Then by rule C, x ε α, x < n, and n ≤ x. By Thm.XI.2.20, Cor. 1, we
have a contradiction.

We now consider definition by induction. In its simplest version, this
takes the following form. We set down the two conditions

(I) f(0) = a,

(II) f(n + 1) = g(f(n)),

where a is a specified constant and g(x) is a specified function value of x,
and f is a function which is supposedly being defined by (I) and (II). It is
usual to say that (I) and (II) define f(n) by induction on n.

The intuitive argument for supposing that (I) and (II) define a function
f goes as follows. Certainly, (I) and (II) specify a unique value for f(0),
namely,

f(0) = a.

Then by putting n = 0 in (II), we specify a unique value for f(1), namely,

f(1) = g(f(0)) = g(a).

Now we put n = 1 in (II) and specify a unique value for f(2), namely,

f(2) = g(f(1)) = g(g(a)).

By proceeding in this manner, we specify f(n) uniquely for any given
nonnegative integer n. Moreover, it is clear that we have not specified a
value for f(x) if x is not a nonnegative integer. So we have specified a
unique value for f(n) when and only when n is a nonnegative integer. Thus
we have defined a function f with Arg(f) = Nn.
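The intuitive procedure just described, computing f(0), f(1), f(2), . . . in turn, is what one does when evaluating such a function mechanically. The following Python sketch is only an illustration of that procedure; the name `define_by_induction` and the sample choices of a and g are ours. It does not settle the logical question of whether (I) and (II) constitute a legitimate definition.

```python
from typing import Callable, TypeVar

T = TypeVar("T")


def define_by_induction(a: T, g: Callable[[T], T]) -> Callable[[int], T]:
    """Return f with f(0) = a and f(n + 1) = g(f(n)) for nonnegative n."""

    def f(n: int) -> T:
        if n < 0:
            raise ValueError("f is specified only for nonnegative integers")
        value = a                 # condition (I): f(0) = a
        for _ in range(n):
            value = g(value)      # condition (II): f(k + 1) = g(f(k))
        return value

    return f


if __name__ == "__main__":
    # Sample choice (an assumption for illustration): a = 1, g(x) = 2x.
    power_of_two = define_by_induction(1, lambda x: 2 * x)
    print([power_of_two(n) for n in range(6)])
```

For any particular n the loop terminates after n steps, which mirrors the remark that f(n) is specified uniquely for any given nonnegative integer n.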

The fallacy in the intuitive reasoning just presented is that this reasoning
requires that we specify each of f(0), f(1), f(2), . . . , in turn and then define
f by collecting all the ordered pairs ⟨0,f(0)⟩, ⟨1,f(1)⟩, ⟨2,f(2)⟩, . . . , into a
single assemblage. That is, we are required to define the infinite class f
by listing its members. Clearly this is not humanly possible.

Thus  the  intuitive  reasoning  presented  merely  makes  it  plausible  that 
(I)  and  (II)  do  define  a  function  /. 

Actually, (I) and (II) are circular in that (II) defines f in terms of f
itself. In general a circular definition, in which we try to define a quantity
in terms of itself, is not a legitimate definition. However, there are
exceptional cases in which a circular definition is acceptable. A very familiar
case from mathematics occurs with differential equations. It is well known
that the two conditions

f(0) = 1

f′(x) = f(x)

define a unique function value f(x) of real numbers x. Nevertheless the
second condition is circular, since we cannot use it to obtain f′(x) unless
f(x) is already known, and vice versa.

The standard method for legitimizing the use of differential equations,
in spite of their feature of circularity, is to prove that under suitable
conditions there is one and only one function satisfying a differential equation
with boundary conditions. Then certainly the differential equation and
boundary conditions can be considered as a definition of this unique
function, regardless of the circularity of the differential equation.

We shall proceed similarly with definition by induction. We shall prove
that under rather mild stratification conditions there is exactly one
function f, with Arg(f) = Nn, which satisfies (I) and (II). Then, in spite of
the circularity of (II), it is certainly acceptable to consider (I) and (II) as a
definition of this unique function. Indeed we can then write a formula for
this function, namely,

ιf (f ε Funct:.Arg(f) = Nn:.f(0) = a:.(n):n ε Nn. ⊃ .f(n + 1) = g(f(n))).

Before  proceeding  with  the  proof  that  (I)  and  (II)  determine  a  unique 
function,  we  discuss  certain  general  aspects  of  definition  by  induction. 

As an illustration of definition by induction we cite Hardy's second proof
of the Weierstrass theorem (see Hardy, 1947, page 139). In this, Hardy
starts with an infinite set S on the real line contained in the closed interval
PQ. Then he defines by induction on n the intervals Iₙ according to the
following conditions:

(I) I₀ is PQ.

(II) Given Iₙ, we divide it into two equal parts. If the left-hand half
contains an infinite number of points of S, then we take Iₙ₊₁ to be the
left-hand half. Otherwise we take Iₙ₊₁ to be the right-hand part.

By using definition by cases, we can define a term g(x) such that (II) can
be written as

Iₙ₊₁ = g(Iₙ).

Thus (I) and (II) give a clear case of defining Iₙ by induction on n.
In a slightly more general form of definition by induction, we write down
the two conditions

(I) f(0) = a,

(III) f(n + 1) = h(n,f(n)).

That is, we make f(n + 1) depend upon n as well as the previous value of
f(n). As an illustration of this, let us have given an arithmetical function
value r(n), and let us define

$$\sum_{m=0}^{n} r(m)$$

by induction on n. We write the conditions

$$\text{(I)}\qquad \sum_{m=0}^{0} r(m) = r(0),$$

$$\text{(III)}\qquad \sum_{m=0}^{n+1} r(m) = \Bigl(\sum_{m=0}^{n} r(m)\Bigr) + r(n + 1).$$

Here the function value f(n) to be defined by induction is

$$\sum_{m=0}^{n} r(m)$$

and h is such that

h(n,f(n)) = f(n) + r(n + 1).

That is,

h(x,y) = y + r(x + 1).
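The way the partial sums fit the scheme f(0) = a, f(n + 1) = h(n,f(n)) with h(x,y) = y + r(x + 1) can be made concrete as follows. This Python sketch is illustrative only; the helper names and the sample r are ours.

```python
from typing import Callable


def define_by_weak_induction(
    a: int, h: Callable[[int, int], int]
) -> Callable[[int], int]:
    """Return f with f(0) = a and f(n + 1) = h(n, f(n))."""

    def f(n: int) -> int:
        if n < 0:
            raise ValueError("f is specified only for nonnegative integers")
        value = a                    # f(0) = a
        for k in range(n):
            value = h(k, value)      # f(k + 1) = h(k, f(k))
        return value

    return f


def partial_sums(r: Callable[[int], int]) -> Callable[[int], int]:
    """Return n -> r(0) + r(1) + ... + r(n), built with h(x, y) = y + r(x + 1)."""
    return define_by_weak_induction(r(0), lambda x, y: y + r(x + 1))


if __name__ == "__main__":
    # Sample r (an assumption for illustration): r(m) = m * m.
    s = partial_sums(lambda m: m * m)
    print([s(n) for n in range(6)])
```

Note that h receives the index x as well as the previous value y, which is exactly what distinguishes form (III) from form (II).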

In the instances cited so far, the form of definition by induction used has
been analogous to proof by weak induction in that the function value
f(n + 1) is specified in terms of at most one previous function value,
namely, f(n). As in the case of proof by induction, one need not
necessarily start with zero, but can define f(n) by writing

(I*) f(1) = a,

(II) f(n + 1) = g(f(n)).

Clearly this defines f(n) only for n ≥ 1.

Analogous to proof by strong induction, there are versions of definition by
induction in which the function value f(n + 1) is specified in terms of more
than one preceding function value. There is a particularly important case,
in which f(n + 1) is specified in terms of the two preceding function values,
f(n) and f(n − 1). This is illustrated by the definition of the convergents
of a continued fraction. Given a continued fraction

$$\cfrac{1}{a_1 + \cfrac{1}{a_2 + \cfrac{1}{a_3 + \cdots}}}$$

the numerator pₙ and the denominator qₙ of the nth convergent pₙ/qₙ
are defined by

p₀ = 0, p₁ = 1, pₙ₊₁ = aₙ₊₁pₙ + pₙ₋₁,
q₀ = 1, q₁ = a₁, qₙ₊₁ = aₙ₊₁qₙ + qₙ₋₁

(see Hardy and Wright, Chapter 10).

These are not genuinely different from the type of definition discussed
earlier. If we let f(n) be ⟨pₙ₊₁,pₙ⟩, then we can define f(n) by induction on
n by

(I) f(0) = ⟨1,0⟩,

(III) f(n + 1) = ⟨aₙ₊₂Q₁(f(n)) + Q₂(f(n)), Q₁(f(n))⟩.

Then we can define pₙ by writing

pₙ = Q₂(f(n)).
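The device of carrying a pair of consecutive values, so that a two-term recurrence becomes a definition by weak induction, can be sketched in Python. We assume here that the recurrence for the numerators reads p_{n+2} = a_{n+2} p_{n+1} + p_n with p_0 = 0 and p_1 = 1, and that the partial quotients are supplied as a function a(k) for k ≥ 1; the function name `numerators` is ours.

```python
from typing import Callable, List, Tuple


def numerators(a: Callable[[int], int], count: int) -> List[int]:
    """Return [p_0, ..., p_{count-1}] using the pair-valued f(n) = (p_{n+1}, p_n).

    Assumed recurrence (see the lead-in): p_{n+2} = a(n + 2) * p_{n+1} + p_n.
    """
    state: Tuple[int, int] = (1, 0)        # (I): f(0) = <p_1, p_0> = <1, 0>
    result: List[int] = []
    for n in range(count):
        first, second = state              # Q1(f(n)) and Q2(f(n))
        result.append(second)              # p_n = Q2(f(n))
        # (III): f(n + 1) = <a_{n+2} * Q1(f(n)) + Q2(f(n)), Q1(f(n))>
        state = (a(n + 2) * first + second, first)
    return result


if __name__ == "__main__":
    # With every partial quotient equal to 1 the numerators follow
    # the Fibonacci pattern, since p_{n+2} = p_{n+1} + p_n.
    print(numerators(lambda k: 1, 8))
```

Each step uses only the current pair and the index n, so the computation has the form f(n + 1) = h(n,f(n)) even though each pₙ depends on two earlier numerators.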

However, there are cases, in strict analogy with proof by strong
induction, in which f(n + 1) is specified in terms of all preceding function values,
f(0), f(1), . . . , f(n). This is used, for example, in the proof that every
real number has a decimal expansion (see Hardy, 1947, pages 150 to 151).
As an illustration, we shall give the details for the slightly simpler proof
that every real number x with 0 < x ≤ 1 has a nonterminating binary
expansion. The critical point in the proof is the definition by induction on
n of the nth digit in the binary expansion of x. Let f(n) be the nth digit
after the binary point. (In dealing with binary expansions, the term
"binary point" corresponds to the term "decimal point" in dealing with
decimal expansions.) It is convenient to take f(0) = 0, which has the
same effect as specifying that there are to be no digits to the left of the
binary point. Given f(0), f(1), . . . , f(n), we define f(n + 1) (which is the
(n + 1)st digit) as follows.

Case 1. If

$$2^{n+1}\Bigl(x - \sum_{m=0}^{n} f(m)2^{-m}\Bigr) \le 1,$$

take f(n + 1) to be zero.

Case 2. Otherwise, take f(n + 1) to be 1.

This is a definition by strong induction since the value of f(n + 1)
depends upon all the values f(0), f(1), . . . , f(n), and not merely on the value
f(n). We shall prove that this definition does indeed define a function f.
Accepting this for the moment, it is clear that the value f(n) must be either
0 or 1 for each positive integer n, and so can serve as the nth digit of a
binary expansion. We readily prove by weak induction on n (using the
fact that 0 < x ≤ 1), that for 0 ≤ n,

$$\text{(XI.3.1)}\qquad 0 < x - \sum_{m=0}^{n} f(m)2^{-m} \le 2^{-n}.$$

From this follows

$$\text{(XI.3.2)}\qquad x = \lim_{n\to\infty} \sum_{m=0}^{n} f(m)2^{-m},$$

so that f(n) is indeed the nth digit in the binary expansion of x. Moreover,
this expansion is nonterminating, for if it were terminating, this would
mean that there is an N > 0 such that

(n):n ε Nn.n > N. ⊃ .f(n) = 0.

Then by (XI.3.2),

$$x = \sum_{m=0}^{N} f(m)2^{-m},$$

which contradicts (XI.3.1).
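The digit rule can be tried out on rational x with exact arithmetic. The sketch below is an illustration under our reading of the rule, namely that f(n + 1) is 0 when 2^(n+1) times the remainder x − Σ f(m)2^(−m) is at most 1, and 1 otherwise; the function name `binary_digits` is ours. It also checks the bound (XI.3.1) at each step.

```python
from fractions import Fraction
from typing import List


def binary_digits(x: Fraction, count: int) -> List[int]:
    """Return [f(0), f(1), ..., f(count)] for a rational x with 0 < x <= 1."""
    if not (0 < x <= 1):
        raise ValueError("x must satisfy 0 < x <= 1")
    digits = [0]                 # f(0) = 0: nothing to the left of the point
    partial = Fraction(0)        # running value of sum_{m=0}^{n} f(m) 2^(-m)
    for n in range(count):
        remainder = x - partial
        # Invariant corresponding to (XI.3.1): 0 < remainder <= 2^(-n).
        assert 0 < remainder <= Fraction(1, 2 ** n)
        digit = 0 if 2 ** (n + 1) * remainder <= 1 else 1
        digits.append(digit)
        partial += Fraction(digit, 2 ** (n + 1))
    return digits


if __name__ == "__main__":
    # One half comes out in its nonterminating form 0.0111..., not 0.1000...
    print(binary_digits(Fraction(1, 2), 6))
    print(binary_digits(Fraction(1, 3), 6))
```

Because the remainder is kept strictly positive, a digit 1 is chosen only when something is left over afterwards, and this is what forces the expansion to be nonterminating.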

For a general formulation of definition by induction, which includes even
the strong form in which f(n + 1) depends on all of f(0), f(1), . . . , f(n),
we write

(I) f(0) = a,

(IV) f(n + 1) = j(n,f).

In order for this to define a function, we need the condition

(V) (R,S,n)::R,S ε Funct.Arg(R) ⊆ Nn.Arg(S) ⊆ Nn.n ε Nn:.(m):
m ε Nn.m ≤ n. ⊃ .R(m) = S(m).: ⊃ :.j(n,R) = j(n,S).

The effect of (V) is to say that the value of j(n,f) does not involve all
function values of f, but only the values f(0), f(1), . . . , f(n). Thus, in
effect, (IV) defines f(n + 1) in terms of f(0), f(1), . . . , f(n).
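A computational analogue of (I), (IV), and (V) may help fix ideas. In the Python sketch below (all names are ours, and this is an informal illustration rather than the justification promised in the text), j is handed a lookup that answers only for the arguments 0, . . . , n. Refusing any other argument is our way of imitating condition (V), under which j(n,f) may depend only on f(0), . . . , f(n).

```python
from typing import Callable, List

Lookup = Callable[[int], int]


def define_by_strong_induction(
    a: int, j: Callable[[int, Lookup], int]
) -> Callable[[int], int]:
    """Return f with f(0) = a and f(n + 1) = j(n, f restricted to 0..n)."""
    values: List[int] = [a]              # values[k] holds f(k); (I): f(0) = a

    def f(n: int) -> int:
        if n < 0:
            raise ValueError("f is specified only for nonnegative integers")
        while len(values) <= n:
            k = len(values) - 1          # f(0), ..., f(k) are already known
            prefix = tuple(values)

            def known(m: int, prefix=prefix) -> int:
                # Only f(0), ..., f(k) may be consulted, in the spirit of (V).
                if not 0 <= m < len(prefix):
                    raise ValueError("j(n, f) may use only f(0), ..., f(n)")
                return prefix[m]

            values.append(j(k, known))   # (IV): f(k + 1) = j(k, f)
        return values[n]

    return f


if __name__ == "__main__":
    # Sample j (an assumption for illustration): the sum of all earlier values.
    total = define_by_strong_induction(
        1, lambda n, f: sum(f(m) for m in range(n + 1))
    )
    print([total(n) for n in range(7)])
```

A choice of j that tried to consult f(n + 1) or later values would be rejected by the lookup, just as such a j would fail condition (V).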

Our earlier forms of definition by induction are included in the general
form just stated. For if we write down

(I) f(0) = a

(II) f(n + 1) = g(f(n)),

then in (IV) and (V), we can take j(n,f) to be g(f(n)). Clearly (V) is
satisfied. If we write down

(I) f(0) = a

(III) f(n + 1) = h(n,f(n)),

then analogously we take j(n,f) to be h(n,f(n)) in (IV) and (V).

If we make a definition by induction using (I) and (II) or (I) and (III),
we shall say that it is a definition by weak induction. If we use (I) and (IV),
with (V) satisfied, we shall say that it is a definition by strong induction.

We have our choice of various schemes for proving the acceptability of
definition by induction. One scheme is to wait until the next chapter
after we have proved a very general theorem on definition by transfinite
induction. Then we can justify definition by strong induction as a special
case of definition by transfinite induction. Then definition by weak
induction can be justified as a special case of definition by strong induction.

This scheme has a drawback. We need to use definition by weak
induction in the treatment of denumerable classes in the next section. If we
postpone a proof of the validity of definition by weak induction until the
next chapter, we must similarly postpone portions of the theory of
denumerable classes.

A  second  scheme  is  to  justify  definition  by  weak  induction  and  then  use 
definition  by  weak  induction  to  justify  definition  by  strong  induction.  This 
scheme  has  the  drawback  that  using  definition  by  weak  induction  to  justify 
definition  by  strong  induction  is  a  lengthy  and  complicated  process. 

We  shall  adopt  a  compromise.  We  shall  proceed  now  to  justify  definition 
by  weak  induction.  Thus  we  shall  have  it  available  for  use  in  the  theory  of 
denumerable  classes.  However,  as  we  have  no  immediate  need  for  definition 
by  strong  induction,  we  shall  leave  its  justification  until  the  next  chapter, 
when  we  can  treat  it  as  a  special  case  of  definition  by  transfinite  induction. 

In justifying definition by weak induction, it clearly suffices to justify the
form involving (I) and (III), since the form involving (I) and (II) is more
special, resulting from taking h(n,f(n)) to be just g(f(n)). In our
justification, we shall have need for the following hypothesis.

Hypothesis H4. Let a, R, S, α, m, n, x, y, and z be distinct variables.
Let h(x,y) be a term in which neither of n, R, or S occurs. If A and B are
terms, let h(A,B) denote {Sub in h(x,y): A for x, B for y}, where it shall be
understood that the substitutions indicated cause no confusion.

We prove first that (I) and (III) can be satisfied by at most one function.
We take the general case in which the induction starts at m instead of 0.

**Theorem XI.3.23. Assume Hypothesis H4. Then

(1) m ε Nn,

(2) α = n̂(n ε Nn.m ≤ n),

(3) R,S ε Funct,

(4) Arg(R) = Arg(S) = α,

(5) R(m) = S(m) = a,

(6) (n):n ε α. ⊃ .R(n + 1) = h(n,R(n)),

(7) (n):n ε α. ⊃ .S(n + 1) = h(n,S(n)),

yield

R = S.


Proof. Assume the hypotheses. We shall use Thm.X.5.16. So we
prove by weak induction on n that (n):n ε α. ⊃ .R(n) = S(n). So let F(x)
be R(x) = S(x). By Thm.XI.3.18, it suffices to prove

(8) F(m)

and

(9) (n):n ε α.F(n). ⊃ .F(n + 1).

We get (8) from (5) and (9) from (6) and (7).

We  remark  that  this  theorem  continues  to  hold  if  we  replace  a  by  any 
term  with  no  free  occurrences  of  n,  since  the  theorem  can  be  reduced  to  an 
implication  by  repeated  use  of  the  deduction  theorem,  and  then  one  can  use 
rule  G  with  a,  after  which  one  can  replace  a  by  any  term  A  by  use  of 
Axiom  scheme  6  or  Axiom  scheme  8,  provided  that  this  replacement  causes 
no  confusion  of  bound  variables.  As  there  may  be  free  occurrences  of  a  in 
h(x,y),  one  must  assume  that  there  are  no  free  occurrences  of  n  in  A  if  one 
is  to  be  sure  of  avoiding  confusion. 

We now prove that, under a simple stratification condition, (I) and (III)
are satisfied by at least one function f, which must then be unique by the
preceding theorem. The proof is motivated as follows. We are requiring

(I) f(m) = a,

(III) f(n + 1) = h(n,f(n)) for n ≥ m.

Thus we require

f(m) = a,

f(m + 1) = h(m,a),

f(m + 2) = h(m + 1,h(m,a)),

etc.

That is, we wish f to consist of the following collection of ordered pairs:

⟨m,a⟩,

⟨m + 1,h(m,a)⟩,

⟨m + 2,h(m + 1,h(m,a))⟩,

etc.

Now we note that, if

S = λxy(⟨x + 1,h(x,y)⟩),

then

S(⟨n,y⟩) = ⟨n + 1,h(n,y)⟩,

so that

S(⟨m,a⟩) = ⟨m + 1,h(m,a)⟩,

S(⟨m + 1,h(m,a)⟩) = ⟨m + 2,h(m + 1,h(m,a))⟩,

etc.

That is, we wish f to consist of the ordered pairs

⟨m,a⟩,

S(⟨m,a⟩),

S(S(⟨m,a⟩)),

etc.

This can be accomplished by taking f to be

Clos({⟨m,a⟩},xSz).

Accordingly, we prove our theorem by defining f so.
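The idea of obtaining f as the closure of the single pair ⟨m,a⟩ under the map sending ⟨x,y⟩ to ⟨x + 1,h(x,y)⟩ can be imitated on a finite scale. In the Python sketch below, the cutoff `upper` is our own device for keeping the set finite, and the names and the sample h are likewise ours; the construction in the text has no such bound.

```python
from typing import Callable, Set, Tuple

Pair = Tuple[int, int]


def closure_graph(
    m: int, a: int, h: Callable[[int, int], int], upper: int
) -> Set[Pair]:
    """Close {(m, a)} under S((x, y)) = (x + 1, h(x, y)), keeping x <= upper."""

    def S(pair: Pair) -> Pair:
        x, y = pair
        return (x + 1, h(x, y))

    pairs: Set[Pair] = {(m, a)}          # the starting class {<m, a>}
    frontier = [(m, a)]
    while frontier:
        image = S(frontier.pop())
        # Stop at the artificial cutoff; otherwise the closure is infinite.
        if image[0] <= upper and image not in pairs:
            pairs.add(image)
            frontier.append(image)
    return pairs


if __name__ == "__main__":
    # Sample h (an assumption for illustration): h(x, y) = (x + 1) * y,
    # started at m = 1 with a = 1.
    graph = closure_graph(1, 1, lambda x, y: (x + 1) * y, 6)
    print(sorted(graph))
```

Each first coordinate occurs in exactly one pair of the resulting set, which is the finite counterpart of the fact, proved below, that the closure is a function.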
**Theorem XI.3.24. Assume the Hypothesis H4. Assume further that
R(n + 1) = h(n,R(n)) is stratified. Then

(1) m ε Nn,

(2) α = n̂(n ε Nn.m ≤ n),

(3) S = λxy(⟨x + 1,h(x,y)⟩),

(4) R = Clos({⟨m,a⟩},xSz)

yield

(5) R ε Funct,

(6) Arg(R) = α,

(7) R(m) = a,

(8) (n):n ε α. ⊃ .R(n + 1) = h(n,R(n)).

Proof. Assume the hypotheses. If R(n + 1) = h(n,R(n)) is stratified,
then x = y = h(x,y) is stratified. So by Thm.X.5.28,

(9) (x,y,z):⟨x,y⟩Sz. ≡ .z = ⟨x + 1,h(x,y)⟩.

Then by Thm.X.4.34,

(10) ⟨m,a⟩ ε R,

by Thm.X.4.35,

(11) (x,y):⟨x,y⟩ ε R. ⊃ .⟨x + 1,h(x,y)⟩ ε R,

by Thm.X.4.36,

(12) (β)::⟨m,a⟩ ε β:(w,z):w ε β.w ε R.wSz. ⊃ .z ε β.: ⊃ :.R ⊆ β,

and by Thm.X.4.37,

(13) (z):.z ε R: ≡ :z = ⟨m,a⟩.v.(Ew).w ε R.wSz.

Let β be α × V. Then by (1) and (2), ⟨m,a⟩ ε β. Also, w ε β. ⊃ .(Ex,y).
x ε α.w = ⟨x,y⟩. So if w ε β.w ε R.wSz, then x ε α.⟨x,y⟩Sz. So by (1), (2),
and (9), x + 1 ε α.z = ⟨x + 1,h(x,y)⟩. Hence z ε β. Thus by (12), R ⊆ β.
Thus

(14) R ε Rel,

(15) Arg(R) ⊆ α.

Let F(x) be x ε Arg(R). By (10), F(m). By (11), (n):n ε α.F(n). ⊃ .
F(n + 1). So by Thm.XI.3.18, (n):n ε α. ⊃ .F(n). That is, α ⊆ Arg(R).
Then by (15), we have established (6).

By (9),

(16) (Ex,y).xRy.z = ⟨x + 1,h(x,y)⟩: ⊃ :(Ew).w ε R.wSz.

If w ε R, then by (14), w = ⟨x,y⟩. Consequently, by (9),

(Ew).w ε R.wSz: ⊃ :(Ex,y).xRy.z = ⟨x + 1,h(x,y)⟩.

So by (16) and (13),

(17) (z):.z ε R: ≡ :z = ⟨m,a⟩.v.(Ex,y).xRy.z = ⟨x + 1,h(x,y)⟩.

Lemma.     (n):.n  e  a-.  D  :{u,v):nRu.nRv.  D  .u  =  v. 

Proof.  Proof  by  weak  induction  on  n  (Thm.XL3.18).  Let  F(n)  be 
(u,v):nRu.nRv.  D  .u  =  v. 

First  we  wish  to  prove  F(m).  Assume  mRu.  Then  by  (17),  {7n,u)  = 
{m,a}.w.(Ex,y).xRy.(m,u)  =  (x -{-  l,h{x,y)).  If  xRy.{m,u)  =  {x -\-  l,h(x,y)), 
then  X  fahy  (6)  and  m  =  x  -\-  1.    From  x  e  ahy  {2),m  <  x.    So  .r  +  1  <  x, 
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contradicting  Thm.XI.3.5,  Cor.  2.  Consequently  {m,u)  =  {m,a),  and  u  =  a. 
So  we  have  shown  mRu.  D  .u  =  a.    Hence  we  conclude 

(i)  F{m).     . 

Now assume n ∈ α, F(n), and (n + 1)Ru. Then by (17), ⟨n + 1,u⟩ =
⟨m,a⟩.∨.(Ex,y).xRy.⟨n + 1,u⟩ = ⟨x + 1,h(x,y)⟩. If we had n + 1 = m,
we would get the contradiction n + 1 ≤ n from n ∈ α. So xRy.⟨n + 1,u⟩ =
⟨x + 1,h(x,y)⟩. By (6), x ∈ α. Also n + 1 = x + 1, u = h(x,y), so that
n = x. By this and xRy, we have nRy. By this and F(n) and
Thm.VII.2.1, (E₁y).nRy. So by Axiom scheme 11, nRy. ≡ .y = R(n). So
y = R(n). But x = n and u = h(x,y). So u = h(n,R(n)). The net result
of all the development since (i) is to show

n ∈ α.F(n) ⊢ (n + 1)Ru. ⊃ .u = h(n,R(n)).

So

n ∈ α.F(n) ⊢ F(n + 1).

Then  by  (i),  our  lemma  follows. 

Now  by  (14),  (6),  and  our  lemma,  we  establish  (5). 

By (10), mRa. Hence by (5), (6), and Thm.X.5.3, we establish (7).

Let n ∈ α. Then by (5), (6), and Thm.X.5.3, Cor. 1, nR(R(n)). Then by
(11), (n + 1)Rh(n,R(n)). So by (5) and Thm.X.5.3, R(n + 1) = h(n,R(n)).
Thus we establish (8).
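
The construction used in this proof can be imitated on a finite scale. The following Python sketch is only an illustration and is not part of the formal system: it builds R as the closure of {⟨m,a⟩} under the step from ⟨x,y⟩ to ⟨x + 1,h(x,y)⟩, cut off at a bound, and then checks the analogues of (5), (7), and (8). The names `closure_relation` and `bound`, and the sample step function `h`, are assumptions made for the example.

```python
def closure_relation(m, a, h, bound):
    """Finite part of R = Clos({<m,a>}, xSz), where S sends <x,y> to <x+1, h(x,y)>.

    Only pairs whose first component is at most `bound` are generated.
    """
    relation = {(m, a)}
    frontier = [(m, a)]
    while frontier:
        x, y = frontier.pop()
        if x + 1 > bound:
            continue
        successor = (x + 1, h(x, y))
        if successor not in relation:
            relation.add(successor)
            frontier.append(successor)
    return relation


def h(n, value):
    # Hypothetical step function: it gives R(n + 1) = (n + 1) * R(n).
    return (n + 1) * value


R = closure_relation(m=0, a=1, h=h, bound=6)
as_function = dict(R)

# Analogue of (5): R is a function, so no argument occurs in two pairs.
assert len(as_function) == len(R)
# Analogue of (7): R(m) = a.
assert as_function[0] == 1
# Analogue of (8): R(n + 1) = h(n, R(n)) on the part that was generated.
assert all(as_function[n + 1] == h(n, as_function[n]) for n in range(6))
```

With this choice of h the generated values are the factorials, which is the kind of definition that weak induction is meant to justify.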

The  theorem  continues  to  hold  if  we  replace  a  by  any  term  with  no  free 
occurrences  of  x,  y,  z,  or  n,  for  the  same  reasons  which  we  gave  after 
Thm.XI.3.23.  We  do  not  even  need  to  require  any  stratification  conditions 
for  the  term  which  we  use  for  a. 

Note that by (4), we can say that there is actually a term R which satisfies
(5), (6), (7), and (8). Also, this term R contains occurrences of a and m as
free variables, and also all free occurrences in h(x,y) of variables other than
x and y are free occurrences in R. Also R is stratified if f(n + 1) =
h(n,f(n)) is stratified. Hence if any question should arise about free
occurrences of variables, or about stratification, for the f defined by weak
induction, we can assume the conditions stated above, since we know that
there does exist a term R satisfying the conditions stated.

We can write another term which satisfies (5), (6), (7), and (8) and
satisfies the various conditions about stratification and occurrences of
free variables which we have just cited. This is

ιR (R ∈ Funct:.Arg(R) = n̂(n ∈ Nn.m ≤ n):.R(m) = a:.
(n):n ∈ Nn.m ≤ n. ⊃ .R(n + 1) = h(n,R(n))).

It  should  be  noted  that  by  trivial  variations  we  can  justify  definition  by 
weak induction for other stratification conditions than those assumed in
Thm.XI.3.24. We shall not attempt to state the most general possible
variation  and  the  corresponding  stratification  condition,  but  shall  show  one 
simple  variation.  With  this  as  a  model,  the  reader  can  easily  work  out 
other  variations. 

Theorem  XI.3.25.     Assume  Hypothesis  H4.    Then 

(1) m ∈ Nn,

(2) α = n̂(n ∈ Nn.m ≤ n),

(3) R,S ∈ Funct,

(4) Arg(R) = Arg(S) = USC(α),

(5) R({m}) = S({m}) = a,

(6) (n):n ∈ α. ⊃ .R({n + 1}) = h(n,R({n})),

(7) (n):n ∈ α. ⊃ .S({n + 1}) = h(n,S({n})),

yield

R = S.

Proof. Prove by induction on n that

(n):n ∈ α. ⊃ .R({n}) = S({n}).

Theorem XI.3.26. Assume Hypothesis H4. Assume further that
R({n + 1}) = h(n,R({n})) is stratified. Then

(1) m ∈ Nn,

(2) α = n̂(n ∈ Nn.m ≤ n),

(3) S = λxy(⟨{(ιv (x = {v})) + 1},h(ιv (x = {v}),y)⟩),

(4) R = Clos({⟨{m},a⟩},xSz)
yield

(5) R ∈ Funct,

(6) Arg(R) = USC(α),

(7) R({m}) = a,

(8) (n):n ∈ α. ⊃ .R({n + 1}) = h(n,R({n})).

Proof. We proceed as in the proof of Thm.XI.3.24.

Although  we  shall  not  give  all  proofs  until  the  next  chapter,  we  shall 
state  here  for  convenient  reference  the  theorems  which  justify  definition  by 
strong induction. We use the following hypothesis.

Hypothesis H5. Let α, R, S, a, f, m, and n be distinct variables. Let
j(n,f) be a term not containing any occurrences of R or S. If A and B are
terms, let j(A,B) denote {Sub in j(n,f): A for n, B for f}, where it shall be
understood that the substitutions indicated cause no confusion.

Theorem XI.3.27. Assume Hypothesis H5. Then

(1) m ∈ Nn,

(2) α = n̂(n ∈ Nn.m ≤ n),

(3) R,S ∈ Funct,

(4) Arg(R) = Arg(S) = α,

(5) R(m) = S(m) = a,

(6) (n):n ∈ α. ⊃ .R(n + 1) = j(n,R),

(7) (n):n ∈ α. ⊃ .S(n + 1) = j(n,S),

(8) (R,S,n)::R,S ∈ Funct.Arg(R) ⊆ α.Arg(S) ⊆ α.n ∈ α:.(x):x ∈ α.x ≤ n.
⊃ .R(x) = S(x).: ⊃ :.j(n,R) = j(n,S)
yield

R = S.

Proof. Assume the hypotheses. We shall use Thm.X.5.16. So we
prove by strong induction on n that (n):n ∈ α. ⊃ .R(n) = S(n). So let
F(x) be R(x) = S(x). By Thm.XI.3.19, it suffices to prove

(9) F(m)
and

(10) (n)::n ∈ α:.(x):x ∈ α.x ≤ n. ⊃ .F(x).: ⊃ :.F(n + 1).
However, F(m) follows by (5). By (3), (4), and (8),

(n)::n ∈ α:.(x):x ∈ α.x ≤ n. ⊃ .F(x).: ⊃ :.j(n,R) = j(n,S).

Then by (6) and (7), we infer (10).

Theorem XII.2.12. Assume Hypothesis H5. Assume further that
R(n + 1) = j(n,R) is stratified. Then

(1) m ∈ Nn,

(2) α = n̂(n ∈ Nn.m ≤ n),

(3) (R,S,n)::R,S ∈ Funct.Arg(R) ⊆ α.Arg(S) ⊆ α.n ∈ α:.(x):x ∈ α.x ≤ n.
⊃ .R(x) = S(x).: ⊃ :.j(n,R) = j(n,S)
yield

(ER)::R ∈ Funct.Arg(R) = α.R(m) = a:.(n):n ∈ α. ⊃ .R(n + 1) = j(n,R).

If we should wish to prove this theorem with the means now at our
disposal, we could proceed along the following lines. If R is the function to
be defined, let Rₙ denote that portion of R which consists of the ordered pairs

⟨m,R(m)⟩,

⟨m + 1,R(m + 1)⟩,

. . .

⟨m + n,R(m + n)⟩.

We can build up the Rₙ's successively by the following scheme.

R₀ = {⟨m,a⟩},

R₁ = R₀ ∪ {⟨m + 1,j(m,R₀)⟩},

R₂ = R₁ ∪ {⟨m + 2,j(m + 1,R₁)⟩},

etc.,
and in general

Rₙ₊₁ = Rₙ ∪ {⟨m + n + 1,j(m + n,Rₙ)⟩}.

Thus the Rₙ's can be defined by weak induction on n. In fact, we can
use Thm.XI.3.25 and Thm.XI.3.26 to define a function f such that f({n})
is just our Rₙ. The conditions on f are

(I) f({0}) = {⟨m,a⟩},

(III) f({n + 1}) = f({n}) ∪ {⟨m + n + 1,j(m + n,f({n}))⟩}.

Then in Thm.XI.3.25 and Thm.XI.3.26 we take a to be {⟨m,a⟩} and
h(x,y) to be

y ∪ {⟨m + x + 1,j(m + x,y)⟩}.

Clearly the stratification conditions are satisfied, and so a function f is
actually defined by (I) and (III) above. As we wish R to be the logical sum
of all the Rₙ, we can now define

R = ŵ(En).n ∈ Nn.w ∈ f({n}).
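
The stage-by-stage scheme just described can be tried out on a small case. The Python sketch below is an informal illustration under stated assumptions, not the formal construction: each stage is a finite function (a dictionary), the next stage adds the single pair ⟨m + n + 1,j(m + n,Rₙ)⟩, and R is taken as the sum of the stages. The name `define_by_strong_induction` and the sample rule `j` are hypothetical.

```python
def define_by_strong_induction(m, a, j, steps):
    """Return the stages R_0, ..., R_steps as dicts from arguments to values.

    R_0 = {<m, a>} and R_{n+1} = R_n U {<m+n+1, j(m+n, R_n)>}.
    """
    stages = [{m: a}]
    for n in range(steps):
        previous = stages[-1]
        extended = dict(previous)
        extended[m + n + 1] = j(m + n, previous)
        stages.append(extended)
    return stages


def j(n, partial):
    # Hypothetical rule: the next value is the sum of all earlier values,
    # so it depends on the whole function so far, not only on partial[n].
    return sum(partial.values())


stages = define_by_strong_induction(m=0, a=1, j=j, steps=5)

# The logical sum of all the stages is the function R being defined.
R = {}
for stage in stages:
    R.update(stage)

# R(m) = a, and R(n + 1) = j(n, R restricted to arguments up to n).
assert R[0] == 1
assert all(R[n + 1] == j(n, {x: R[x] for x in R if x <= n}) for n in range(5))
```

The rule `j` used here looks only at values R(x) with x ≤ n, which is the finite counterpart of the condition that j(n,R) = j(n,S) whenever R and S agree up to n.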

We  now  turn  to  a  study  of  finite  and  infinite  classes.  We  define  a  finite 
class  as  a  class  whose  cardinal  number  is  finite,  and  an  infinite  class  as  a 
class  whose  cardinal  number  is  infinite.  This  is  achieved  by  the  following 
definitions : 

Fin         for         ⋃(Nn)

Infin         for         V − Fin.

Fin and Infin are stratified and contain no free variables and so may be
assigned any type.

Theorem  XI.3.28. 
*I. ⊢ (α):α ∈ Fin. ≡ .(En).n ∈ Nn.α ∈ n.
II. ⊢ (α):α ∈ Infin. ≡ .∼ α ∈ Fin.

Corollary 1. ⊢ (α):α ∈ Fin. ≡ .Nc(α) ∈ Nn.

Corollary 2. ⊢ (α):α ∈ Infin. ≡ .Nc(α) ∈ NC − Nn.

Corollary 3. ⊢ Λ ∈ Fin.

Corollary 4. ⊢ (x).{x} ∈ Fin.

Corollary  5.     h  ^  ^  Infin. 

*Corollary 6. ⊢ (α,β):α ∈ Fin.α sm β. ⊃ .β ∈ Fin.

Corollary 7. ⊢ (α,β):α ∈ Infin.α sm β. ⊃ .β ∈ Infin.

Theorem XI.3.29.
I. ⊢ (α,β):α ⊂ β.α ∈ Fin. ⊃ .Nc(α) < Nc(β).
II. ⊢ (α,β):α ⊂ β.β ∈ Fin. ⊃ .Nc(α) < Nc(β).

Proof of I. Let α ⊂ β and α ∈ Fin. Then by Thm.XI.2.13, Part I,

(1) Nc(α) ≤ Nc(β)

and by Thm.XI.3.28, Cor. 1,

(2) Nc(α) ∈ Nn.

Then by Thm.XI.2.13, Part III, it suffices to prove Nc(α) ≠ Nc(β).
This we do by reductio ad absurdum, to which end we assume

(3) Nc(α) = Nc(β).

By Ex.XI.2.8, Nc(α ∪ β) = Nc(α) + Nc(β − α). Then by Thm.IX.4.13,
Part III, Nc(β) = Nc(α) + Nc(β − α). So by (3), Nc(α) = Nc(α) +
Nc(β − α). So by (2) and Thm.XI.3.2, Cor. 1, Nc(β − α) = 0. Hence
β − α = Λ by Thm.XI.2.7, Cor. 2. Then β ⊆ α by Thm.IX.4.13, Part II.
However, from α ⊂ β, we get ∼(β ⊆ α) by Thm.IX.4.19.

Proof of Part II. Similar.

Corollary 1. ⊢ (α,β):α ⊂ β.α ∈ Fin. ⊃ .∼(α sm β).

Proof. If α sm β, then Nc(α) = Nc(β) by Thm.XI.2.2, Part II.

Corollary 2. ⊢ (α,β):α ⊂ β.β ∈ Fin. ⊃ .∼(α sm β).

Corollary 3. ⊢ (α,β):α ⊂ β.α sm β. ⊃ .α ∈ Infin.

**Corollary 4. ⊢ (α,β):α ⊂ β.α sm β. ⊃ .β ∈ Infin.

This  theorem  and  its  corollaries  tell  us  that  no  finite  class  can  be  similar 
to  a  proper  subset  of  itself,  and  that  if  any  class  is  similar  to  a  proper  subset 
of  itself  then  it  is  an  infinite  class.  A  sort  of  converse,  to  the  effect  that 
every  infinite  class  is  similar  to  a  proper  subclass  of  itself  is  widely  quoted. 
We  do  not  know  how  to  prove  this  converse  except  by  making  use  of  the 
denumerable axiom of choice. As most mathematicians accept the
denumerable axiom of choice without the slightest hesitation, they are able
to  conclude  that  a  class  is  infinite  if  and  only  if  it  is  similar  to  a  proper  subset 
of  itself.    However,  for  the  present,  we  have  only  Thm.XI.3.29,  Cor.  4. 

Theorem XI.3.30. ⊢ Nn sm Nn − {0}.

Proof. Define R = Nn↿λx(x + 1). Clearly

(1) ⊢ Arg(R) = Nn.
Also, by Thm.XI.2.12, corollary,

(2) ⊢ R ∈ 1-1.
By Thm.X.1.2 and Thm.X.1.5,

(3) ⊢ Val(R) ⊆ Nn − {0}.
By Thm.X.1.7,

(4) ⊢ Nn − {0} ⊆ Val(R).

Hence our theorem follows by Thm.XI.1.1, Part IV.

Corollary 1. ⊢ Nn ∈ Infin.

Proof. By Thm.IX.4.21, ⊢ Nn − {0} ⊂ Nn. So we use Thm.XI.3.29,
Cor. 4.

Corollary 2. ⊢ Nc(Nn) ∈ NC − Nn.

Corollary 3. ⊢ (n):n ∈ Nn. ⊃ .n < Nc(Nn).

Proof. Use Thm.XI.3.6, Cor. 3.
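
The correspondence used for Thm.XI.3.30 is the successor map n → n + 1. A finite computation cannot establish a statement about all of Nn, but the following Python sketch checks the three properties from the proof on an initial segment: the map is one-to-one, 0 is never a value, and every nonzero number up to the cutoff is a value. The cutoff `N` and the helper name `successor` are assumptions made for the illustration.

```python
def successor(n):
    return n + 1


N = 1000  # hypothetical cutoff; Nn itself has no such bound
segment = range(N)
image = [successor(n) for n in segment]

# One-to-one on the segment: no value is produced twice.
is_one_to_one = len(set(image)) == len(image)
# The values lie in Nn - {0}: zero is never a successor.
misses_zero = 0 not in image
# Every nonzero number up to N is the successor of something in the segment.
covers_positive = set(image) == set(range(1, N + 1))
```

On a finite segment the image is a different segment of the same size; it is only for the whole of Nn that the image is a proper subset of the domain, which is what makes Nn infinite by Thm.XI.3.29, Cor. 4.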

**Theorem XI.3.31. ⊢ (α,β):α ⊆ β.β ∈ Fin. ⊃ .α ∈ Fin.

Proof. Assume α ⊆ β and β ∈ Fin. Then by Thm.XI.2.13, Part I,
Nc(α) ≤ Nc(β) and by Thm.XI.3.28, Cor. 1, Nc(β) ∈ Nn. So by
Thm.XI.3.3, Nc(α) ∈ Nn. Hence α ∈ Fin.

Corollary 1. ⊢ (α,β):α ⊆ β.α ∈ Infin. ⊃ .β ∈ Infin.

Corollary 2. ⊢ (α,β):α ∈ Fin. ⊃ .α ∩ β ∈ Fin.

Corollary 3. ⊢ (α,β):α ∩ β ∈ Infin. ⊃ .α ∈ Infin.

Corollary 4. ⊢ (α,β):α ∪ β ∈ Fin. ⊃ .α ∈ Fin.

Corollary 5. ⊢ (α,β):α ∈ Infin. ⊃ .α ∪ β ∈ Infin.

Theorem XI.3.32. ⊢ (α,β):α,β ∈ Fin. ⊃ .α ∪ β ∈ Fin.

Proof. Let α,β ∈ Fin. Then by Thm.XI.3.31, Cor. 2, β − α ∈ Fin. So
Nc(α) ∈ Nn, Nc(β − α) ∈ Nn. Hence by Thm.X.1.14, (Nc(α) + Nc(β −
α)) ∈ Nn. Then by Ex.XI.2.8, Nc(α ∪ β) ∈ Nn. So α ∪ β ∈ Fin.

Corollary 1. ⊢ (α,β):.α ∪ β ∈ Infin: ⊃ :α ∈ Infin.∨.β ∈ Infin.

Corollary 2. ⊢ (α,β):α ∪ β ∈ Infin.α ∈ Fin. ⊃ .β ∈ Infin.

Corollary 3. ⊢ (α,β):α ∈ Infin.β ∈ Fin. ⊃ .α − β ∈ Infin.

Proof. If α ∈ Infin, then α ∪ β ∈ Infin by Thm.XI.3.31, Cor. 5. So by
Thm.IX.4.4, Part XIX, (α − β) ∪ β ∈ Infin. If β ∈ Fin, then α − β ∈ Infin
by Cor. 2.

Corollary 4. ⊢ (α,x):α ∈ Infin. ⊃ .α − {x} ∈ Infin.

Theorem XI.3.33. ⊢ (λ):λ ∈ Fin.λ ⊆ Fin. ⊃ .⋃λ ∈ Fin.

Proof. We prove by weak induction on n that

(1) ⊢ (λ,n):n ∈ Nn.λ ∈ n.λ ⊆ Fin. ⊃ .⋃λ ∈ Fin.

Let F(x) be

(λ):λ ∈ x.λ ⊆ Fin. ⊃ .⋃λ ∈ Fin.

By Ex.IX.5.1, Part (b), and Thm.XI.3.28, Cor. 3, (λ):λ ∈ 0. ⊃ .⋃λ ∈ Fin.
So

(2) ⊢ F(0).

Assume n ∈ Nn, F(n), and λ ∈ n + 1, λ ⊆ Fin. Then by Thm.X.1.16,
μ ∈ n.∼ α ∈ μ.λ = μ ∪ {α}. Then μ ⊆ λ, so that μ ⊆ Fin, so that by F(n),

(3) ⋃μ ∈ Fin.
Also α ∈ λ, so that

(4) α ∈ Fin.

By Thm.IX.6.9, Part VIII, ⋃λ = α ∪ ⋃μ. Then by (3) and (4) and
Thm.XI.3.32, ⋃λ ∈ Fin. Thus we have

n ∈ Nn, F(n) ⊢ F(n + 1).

So by (2), we conclude (1). From (1) by a simple transformation, we
get

(λ):.(En).n ∈ Nn.λ ∈ n:λ ⊆ Fin: ⊃ :⋃λ ∈ Fin.

Our theorem now follows by Thm.XI.3.28, Part I.

This  theorem  tells  us  that  the  logical  sum  of  any  finite  number  of  finite 
classes  is  finite. 

Theorem XI.3.34. ⊢ (α,n):n ∈ Nn.α ∈ Infin. ⊃ .(Eβ).β ∈ n.β ⊂ α.

Proof. Assume n ∈ Nn, α ∈ Infin. Then Nc(α) ∈ NC − Nn. So by
Thm.XI.3.6, Cor. 3, n < Nc(α). Then by Thm.XI.2.22, Nc(α) = n + p.
So α ∈ n + p. Then β ∈ n.γ ∈ p.β ∩ γ = Λ.β ∪ γ = α. So β ⊆ α. If β = α,
then n = Nc(α) by Thm.XI.2.2, Part IV. So β ≠ α, and β ⊂ α.

This  theorem  tells  us  that  from  any  infinite  class  we  may  extract  as  large 
a  finite  class  as  we  wish.  Incidentally,  by  Thm.XI.3.32,  Cor.  3,  what 
remains  is  still  infinite. 

Theorem XI.3.35. ⊢ (α,β):α,β ∈ Fin. ⊃ .α × β ∈ Fin.

Proof. Let α,β ∈ Fin. Then Nc(α),Nc(β) ∈ Nn. So by Thm.XI.3.8,
(Nc(α) × Nc(β)) ∈ Nn. Then by Thm.XI.2.27, Nc(α × β) ∈ Nn. So
α × β ∈ Fin.

Theorem XI.3.36. ⊢ (α,β):α × β ∈ Fin. ⊃ .β × α ∈ Fin.

Proof. Use Thm.XI.1.17 and Thm.XI.3.28, Cor. 6.

Theorem XI.3.37. ⊢ (α,β):β ≠ Λ.α × β ∈ Fin. ⊃ .α ∈ Fin.

Proof. If β ≠ Λ and α × β ∈ Fin, then Nc(β) ≠ 0 by Thm.XI.2.7,
Cor. 2, and (Nc(α) × Nc(β)) ∈ Nn by Thm.XI.3.28, Cor. 1, and
Thm.XI.2.27. So by Thm.XI.3.9, Nc(α) ∈ Nn.

Corollary. ⊢ (α,β):α ≠ Λ.α × β ∈ Fin. ⊃ .β ∈ Fin.

Theorem XI.3.38. ⊢ (α):α ∈ Fin. ⊃ .USC(α) ∈ Fin.

Proof. We prove by weak induction on n that

(1) ⊢ (α,n):n ∈ Nn.α ∈ n. ⊃ .USC(α) ∈ Fin.

We let F(x) be (α):α ∈ x. ⊃ .USC(α) ∈ Fin. By Thm.IX.6.12, Part IV,
and Thm.XI.3.28, Cor. 3,

(2) ⊢ F(0).

Assume n ∈ Nn, F(n), and α ∈ n + 1. Then by Thm.X.1.16, β ∈ n.
∼ x ∈ β.α = β ∪ {x}. By F(n), USC(β) ∈ Fin. So by Thm.XI.3.28,

(3) m ∈ Nn,

(4) USC(β) ∈ m.
By Thm.IX.6.10, Part II,

(5) ∼{x} ∈ USC(β).

So by (4), (5), and Thm.X.1.16, USC(β) ∪ {{x}} ∈ m + 1. Then by
(3), USC(β) ∪ {{x}} ∈ Fin. However, by Thm.IX.6.12, Part II, and
Thm.IX.6.16, USC(α) = USC(β) ∪ {{x}}. So USC(α) ∈ Fin. Thus we
have shown

n ∈ Nn, F(n) ⊢ F(n + 1).

So by (2), we conclude (1). By Thm.XI.3.28, our theorem follows from
(1).

Theorem XI.3.39. ⊢ (α):USC(α) ∈ Fin. ⊃ .α ∈ Fin.

Proof. Since ⊢ 1 ∈ Nn, we have ⊢ 1 ⊆ Fin by Thm.IX.5.5, Part II, and
the definition of Fin. So by Thm.IX.6.12, Cor. 2, ⊢ USC(α) ⊆ Fin. Then
by Thm.XI.3.33,

⊢ USC(α) ∈ Fin. ⊃ .⋃(USC(α)) ∈ Fin.

Our theorem follows by Thm.IX.6.11.

Corollary 1. ⊢ (α):α ∈ Fin. ≡ .USC(α) ∈ Fin.

Corollary 2. ⊢ (α):α ∈ Infin. ≡ .USC(α) ∈ Infin.

Theorem  XI.3.40.     |-  {ct,l3):a,l3  e  Fin.  D  .(a  /V  ^)  e  Fin. 

Proof.  Let  a,/3  e  Fin.  Then  USC(a),USC(/?)  e  Fin.  So  Nc(USC(a)), 
Nc(USC(/3))  €  Nn.  Then  by  Thm.XI.3.13,  (Nc(USC(a))  Ac  Nc(USC(^))) 
€  Nn.    Then  by  Thm.XI.2.39,  Nc(a  A  /S)  e  Nn. 

Theorem XI.3.41. ⊢ (α):α ∈ Fin. ⊃ .SC(α) ∈ Fin.

Proof.  Let  a  e  Fin.  Let  A  =  {A,V}.  Then  M  e  2.  So  ^  ^  e  Fin. 
Then  by  Thm.XL3.40,  {a /^  A)  e  Fin.  However,  \-  SC(q:)  sm  (a  A  ^)  by 
Thm.XLL28.    So  SC(a)  e  Fin  by  Thm.XL3.28,  Cor.  6. 

Theorem XI.3.42. ⊢ (α):SC(α) ∈ Fin. ⊃ .α ∈ Fin.

Proof. Let SC(α) ∈ Fin. Then USC(α) ∈ Fin by Thm.XI.3.31 and
Thm.IX.6.18, Cor. 1. So α ∈ Fin by Thm.XI.3.39.

Corollary 1. ⊢ (α):α ∈ Fin. ≡ .SC(α) ∈ Fin.

Corollary 2. ⊢ (α):α ∈ Infin. ≡ .SC(α) ∈ Infin.

Theorem XI.3.43. ⊢ (α,n):n ∈ Nn.α ∈ n. ⊃ .USC²(α) sm x̂(x ∈ Nn.x < n).

Proof. Proof by weak induction on n. Let F(n) denote (α):α ∈ n. ⊃ .
USC²(α) sm x̂(x ∈ Nn.x < n). By Thm.XI.2.15, Cor. 2, and Thm.XI.2.20,
Cor. 2, we get

⊢ x̂(x ∈ Nn.x < 0) = Λ

by Thm.IX.4.12, Part II. Thence we get

(1) ⊢ F(0)

by Thm.IX.6.12, Part IV. Assume n ∈ Nn, F(n), and α ∈ n + 1. By
Thm.X.1.16,

(2) β ∈ n.∼ z ∈ β.α = β ∪ {z}.
From F(n), we get

(3) USC²(β) sm x̂(x ∈ Nn.x < n).

By (2) and Thm.IX.6.6, Cor. 1, β ∩ {z} = Λ. So by Thm.IX.6.12,
Parts I and IV,

(4) USC²(β) ∩ USC²({z}) = Λ.
By (2) and Thm.IX.6.12, Part II,

(5) USC²(α) = USC²(β) ∪ USC²({z}).

By Thm.XI.2.13, Part III, ⊢ ∼(n < n). So ⊢ ∼ n ∈ x̂(x ∈ Nn.x < n).
Then by Thm.IX.6.6, Cor. 1,

(6) x̂(x ∈ Nn.x < n) ∩ {n} = Λ.

By Thm.IX.6.16, USC²({z}) = {{{z}}}. So by Thm.XI.1.10,

(7) ⊢ USC²({z}) sm {n}.

Since n ∈ Nn, we have by Thm.XI.2.14, Cor. 2, and Thm.XI.3.5, Cor. 3,



y ∈ x̂(x ∈ Nn.x < n) ∪ {n}:

≡ :y ∈ Nn.y < n.∨.y = n:

≡ :y ∈ Nn.y < n.∨.y ∈ Nn.y = n:

≡ :y ∈ Nn.y ≤ n:

≡ :y ∈ Nn.y < n + 1:
≡ :y ∈ x̂(x ∈ Nn.x < n + 1).
So

(8) x̂(x ∈ Nn.x < n) ∪ {n} = x̂(x ∈ Nn.x < n + 1).

By (3), (7), (4), (6), and Thm.XI.1.12, corollary,

(USC²(β) ∪ USC²({z})) sm (x̂(x ∈ Nn.x < n) ∪ {n}).

So by (5) and (8),

USC²(α) sm x̂(x ∈ Nn.x < n + 1).

Corollary 1. ⊢ (n):n ∈ Nn. ⊃ .x̂(x ∈ Nn.x < n) ∈ Fin.

Proof. Let n ∈ Nn. Then by Thm.XI.2.2, Part VI, α ∈ n. Then by
Thm.XI.3.28, Part I, α ∈ Fin. So by Thm.XI.3.38, USC²(α) ∈ Fin. Thence
by Thm.XI.3.28, Cor. 6, x̂(x ∈ Nn.x < n) ∈ Fin.

Corollary 2. ⊢ (n):n ∈ Nn. ⊃ .x̂(x ∈ Nn.x ≤ n) ∈ Fin.

Proof. If n ∈ Nn, then

x̂(x ∈ Nn.x ≤ n) = x̂(x ∈ Nn.x < n + 1)

by Thm.XI.3.5, Cor. 3. However, n + 1 ∈ Nn, so that x̂(x ∈ Nn.x < n + 1)
∈ Fin by Cor. 1.

We  have  seen  (Thm.XI.3.21)  that  each  nonempty  set  of  finite  cardinals 
has  a  least  member.  However,  there  are  nonempty  sets  of  finite  cardinals 
with  no  greatest  member;  for  example,  Nn.  For  finite  nonempty  sets,  the 
situation  is  different.  Each  nonempty  finite  set  of  finite  cardinals  has  a 
greatest  member  as  well  as  a  least. 

This  is  just  a  special  case  of  the  general  result  that  each  nonempty  finite 
subset  of  a  simply  ordered  set  must  have  both  a  least  and  greatest  member. 
Even  more  generally,  each  nonempty,  finite  subset  of  a  partially  ordered 
set  must  have  a  minimal  element  and  a  maximal  element.  We  now  prove 
this. 

Theorem XI.3.44. ⊢ (α,R):R ∈ Pord.α ∈ Fin.α ≠ Λ.α ⊆ AV(R). ⊃ .
(Ex).x min_R α.

Proof. We prove by weak induction on n that

(1) (α,R,n):R ∈ Pord.n ∈ Nn.α ∈ n + 1.α ⊆ AV(R). ⊃ .(Ex).x min_R α.

We take F(n) to be (α,R):R ∈ Pord.α ∈ n + 1.α ⊆ AV(R). ⊃ .(Ex).
x min_R α.

Assume R ∈ Pord.α ∈ 1.α ⊆ AV(R). Then α = {y}. Then by the
definition of the term, we readily get y min_R α. Thus we infer

(2) ⊢ F(0).

Assume n ∈ Nn, F(n), and R ∈ Pord.α ∈ (n + 1) + 1.α ⊆ AV(R).
Then β ∈ n + 1.∼ y ∈ β.α = β ∪ {y}. Then β ⊆ α, so that β ⊆ AV(R),
so that by F(n),

(3) z min_R β.

Case 1. yRz. Then y min_R α, for clearly y ∈ α ∩ AV(R), and if w ∈ α ∩
AV(R), then ∼(w(R − I)y), for if w(R − I)y, then w(R − I)z by
Thm.X.6.5, Part VII, and w ∈ β ∩ AV(R) by α = β ∪ {y}, and we have a
contradiction with (3).

Case 2. ∼(yRz). Then

(4) ∼(y(R − I)z).

Then z min_R α, for z ∈ α ∩ AV(R), and if w ∈ α ∩ AV(R), then either
w ∈ β, so that ∼(w(R − I)z) by (3), or else w = y, so that ∼(w(R − I)z)
by (4).

Thus we have established (1). Now assume R ∈ Pord.α ∈ Fin.α ≠ Λ.
α ⊆ AV(R). Then by Thm.XI.3.28, m ∈ Nn.α ∈ m. Since α ≠ Λ, m ≠ 0.
Hence by Thm.X.1.7, n ∈ Nn.m = n + 1. So α ∈ n + 1. Then by (1), our
theorem follows.

Corollary 1. ⊢ (α,R):R ∈ Pord.α ∈ Fin.α ≠ Λ.α ⊆ AV(R). ⊃ .(Ex).
x max_R α.

Proof. Use Thm.X.6.5, Part V.

Corollary 2. ⊢ (α,R):R ∈ Sord.α ∈ Fin.α ≠ Λ.α ⊆ AV(R). ⊃ .(Ex).
x least_R α.

Proof. Use Thm.X.6.10, Part I.

Corollary 3. ⊢ (α,R):R ∈ Sord.α ∈ Fin.α ≠ Λ.α ⊆ AV(R). ⊃ .(Ex).
x greatest_R α.

**Corollary 4. ⊢ (α):.α ⊆ Nn.α ∈ Fin.α ≠ Λ: ⊃ :(En):n ∈ α:(m).m ∈ α ⊃
m ≤ n.

Proof. Take R to be Nn↿ ≤ ↾Nn in Cor. 3.
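
The induction in the proof of Thm.XI.3.44 is in effect an algorithm: keep a minimal element z of the part of the class already examined, and when a new member y is adjoined, replace z by y exactly when yRz. The Python sketch below follows that pattern as an informal illustration. The function name `minimal_element`, the sample class, and the use of divisibility as the partial ordering are assumptions chosen for the example.

```python
def minimal_element(elements, leq):
    """Return a minimal element of a nonempty finite collection under `leq`.

    Mirrors the induction: z is minimal in the part seen so far; a new y
    replaces z when leq(y, z) holds (Case 1), otherwise z is kept (Case 2).
    """
    items = list(elements)
    if not items:
        raise ValueError("the class must be nonempty")
    z = items[0]
    for y in items[1:]:
        if leq(y, z):
            z = y
    return z


def divides(a, b):
    # Divisibility is a partial ordering of the positive integers.
    return b % a == 0


sample = [12, 18, 4, 6, 9, 27]
z = minimal_element(sample, divides)

# z is minimal: no other member of the sample lies strictly below it.
is_minimal = not any(divides(w, z) and w != z for w in sample)

# Reversing the ordering yields a maximal element, as in Cor. 1.
top = minimal_element(sample, lambda a, b: divides(b, a))
```

Because divisibility is only a partial ordering, the element found need not be least: here 9 is also minimal in the sample, which is why the theorem speaks of a minimal element rather than a least one.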

Theorem XI.3.45. ⊢ (α)::α ⊆ Nn:.(n):n ∈ Nn. ⊃ .(Ex).x ∈ α.x > n.: ⊃ :.
α ∈ Infin.

Proof. By Thm.XI.3.44, Cor. 4,

(1) ⊢ (α)::α ⊆ Nn:α ≠ Λ:.∼(En):n ∈ α:(m).m ∈ α ⊃ m ≤ n.: ⊃ :.α ∈ Infin.
Now assume

(2) α ⊆ Nn

(3) (n):n ∈ Nn. ⊃ .(Ex).x ∈ α.x > n.

By putting n = 0 in (3), we get α ≠ Λ. Also from (3) by duality and
Thm.XI.3.7, Cor. 2, ∼(En):n ∈ α:(m).m ∈ α ⊃ m ≤ n. So by (1), α ∈ Infin,
and we infer our theorem.

An  early  use  of  this  theorem  was  Euclid's  proof  that  there  are  infinitely 
many  primes.  For  given  any  n,  the  prime  divisors  of  (n!)  +  1  must  be 
greater  than  n,  so  that  there  is  a  prime  greater  than  n.  Hence  by  the 
theorem  above,  the  set  of  primes  is  infinite. 
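
Euclid's argument can be checked numerically for small n. The Python sketch below is an illustration only: for each n it takes the smallest prime divisor of (n!) + 1 and confirms that it exceeds n. The helper names `smallest_prime_divisor` and `prime_greater_than` are assumptions, and trial division is used simply because the numbers involved are small.

```python
from math import factorial


def smallest_prime_divisor(k):
    """Smallest prime dividing k, for k >= 2, found by trial division."""
    d = 2
    while d * d <= k:
        if k % d == 0:
            return d
        d += 1
    return k


def prime_greater_than(n):
    """A prime exceeding n, obtained as a prime divisor of n! + 1."""
    return smallest_prime_divisor(factorial(n) + 1)


witnesses = {n: prime_greater_than(n) for n in range(10)}

# Every prime divisor of n! + 1 exceeds n, since each k with 2 <= k <= n
# divides n! and so leaves remainder 1 when dividing n! + 1.
assert all(p > n for n, p in witnesses.items())
```

Note that (n!) + 1 itself need not be prime: for n = 4 it is 25, whose prime divisor 5 is still greater than 4.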


EXERCISES 


XI.3.1.     Prove: 


(a) ⊢ (m,n):.m ∈ Nn.n ∈ NC: ⊃ :n ≤ m. ≡ .(Ep).p ∈ Nn.m = n + p.

(b) ⊢ (m,n):.m ∈ Nn.n ∈ NC: ⊃ :n < m. ≡ .(Ep).p ∈ Nn.m = n + p + 1.

XI.3.2.     Prove: 

(a) ⊢ (m,n):m,n ∈ NC.m ≠ 0.n^m ∈ Nn. ⊃ .n ∈ Nn.

(b) ⊢ (m,n):m,n ∈ NC.2 ≤ n.n^m ∈ Nn. ⊃ .m ∈ Nn.

(c) ⊢ (m,n):m,n ∈ NC.m ≠ 0.2 ≤ n.n^m ∈ Nn. ⊃ .m,n ∈ Nn.

XI.3.3. Prove ⊢ (n):n ∈ Nn. ⊃ .x̂(x ∈ Nn.x < n) sm x̂(x ∈ Nn.x ≠ 0.
x ≤ n).

XI.3.4. Prove that, if F(x) is stratified, then F(0),F(1),(n):n ∈ Nn.
F(n). ⊃ .F(n + 2) ⊢ (n):n ∈ Nn. ⊃ .F(n).

XI.3.5. Expose the flaw in the following incorrect proof. We wish to
show that all members of any finite nonempty class α are the same. We
prove this by weak induction on the number of members of α. When α has
a single member, the result is obvious. Assume the result for all α's with n
members, and let α have n + 1 members. Removing the last member of
α, the remaining n members must all be the same by our assumption.
Similarly, upon removing the first member of α, the remaining n members
must  all  be  the  same.  Hence  the  last  member  must  be  the  same  as  the 
first  m. 

XI.3.6. Prove that, if a and b are specified constants and h(x,y,z) is a
specified function value of x, y, and z such that x = y = z = h(x,y,z) is
stratified, then there is a unique f such that

f ∈ Funct,

Arg(f) = Nn,

f(0) = a,

f(1) = b,

(n):n ∈ Nn. ⊃ .f(n + 2) = h(n,f(n),f(n + 1)).

XI.3.7. Prove ⊢ Nn↿ ≤ ↾Nn ∈ Word.

XI.3.8. Writing m − n for ιp (p ∈ NC.n + p = m), prove:

(a) ⊢ (m,n):m ∈ NC.n ∈ Nn.n ≤ m. ⊃ .(m − n) + n = m.m − n ∈ NC.

(b) ⊢ (m,n,p):m,p ∈ NC.n ∈ Nn.n + p = m. ⊃ .p = m − n.

(c) ⊢ (m,n):m ∈ NC.n ∈ Nn. ⊃ .(m + n) − n = m.

(d) ⊢ (m,n,p):m,n ∈ NC.p ∈ Nn.p ≤ n. ⊃ .(m + n) − p = m + (n − p).

(e) ⊢ (m,n,p):m ∈ NC.n,p ∈ Nn.n ≤ m + p.p ≤ n. ⊃ .m − (n − p) =
(m + p) − n.

(f) ⊢ (m,n,p):m ∈ NC.n,p ∈ Nn.p ≤ m. ⊃ .(m + n) − (n + p) = m − p.

(g) ⊢ (m,n,p):m ∈ NC.n,p ∈ Nn.n + p ≤ m. ⊃ .m − (n + p) = (m − n) − p.

(h) ⊢ (m,n):m ∈ NC.n ∈ Nn.n ≤ m. ⊃ .m − n ≤ m.

(i) ⊢ (m,n,p):m ∈ NC.n,p ∈ Nn.n ≤ m. ⊃ .(m × p) − (n × p) =
(m − n) × p.

(j) ⊢ (m):m ∈ Nn. ⊃ .m − m = 0.

XI.3.9. Writing Q(m ÷ n) for ιq (q ∈ NC.n × q ≤ m.m < n × (q + 1)),
and R(m ÷ n) for m − (n × Q(m ÷ n)), prove:

(a) ⊢ (m,n):m,n ∈ Nn.n ≠ 0. ⊃ .Q(m ÷ n) ∈ Nn.n × Q(m ÷ n) ≤ m.
m < n × (Q(m ÷ n) + 1).

(b) ⊢ (m,n,q):m,n ∈ Nn.q ∈ NC.n × q ≤ m.m < n × (q + 1). ⊃ .
q = Q(m ÷ n).

(c) ⊢ (m,n):m,n ∈ Nn.n ≠ 0. ⊃ .R(m ÷ n) ∈ Nn.m = (n × Q(m ÷ n)) +
R(m ÷ n).

(d) ⊢ (m,n):m,n ∈ Nn.n ≠ 0. ⊃ .0 ≤ R(m ÷ n).R(m ÷ n) < n.

(e) ⊢ (m,n):m,n ∈ Nn.n ≠ 0. ⊃ .Q((m × n) ÷ n) = m.R((m × n) ÷ n) = 0.

(f) ⊢ (m):m ∈ Nn.m ≠ 0. ⊃ .Q(m ÷ m) = 1.

(g) ⊢ (m,n):m,n,p ∈ Nn.n ≠ 0.R(m ÷ n) = 0. ⊃ .Q((m + p) ÷ n) =
Q(m ÷ n) + Q(p ÷ n).

(h) ⊢ (m,n,p):m,n,p ∈ Nn.n ≠ 0.p ≤ m. ⊃ .n^(m − p) = Q((n^m) ÷ (n^p)).

(i) ⊢ (m,n,p):m,n,p ∈ Nn.n ≠ 0.m = n × p. ⊃ .p = Q(m ÷ n).
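
For finite m and n the quotient and remainder of Ex.XI.3.9 can be computed directly from their defining conditions. The following Python sketch is an informal aid to the exercise, not a proof: `Q` searches for the unique q with n × q ≤ m < n × (q + 1), and `R` is m − (n × Q(m ÷ n)). Writing `Q(m, n)` for Q(m ÷ n) is a notational assumption of the sketch, and the upward search is chosen for transparency rather than speed.

```python
def Q(m, n):
    """The unique q with n*q <= m < n*(q + 1), found by searching upward from 0."""
    if n == 0:
        raise ValueError("Q(m / n) is only claimed to exist for n != 0")
    q = 0
    while not (n * q <= m < n * (q + 1)):
        q += 1
    return q


def R(m, n):
    """The remainder m - n * Q(m / n)."""
    return m - n * Q(m, n)


# Spot checks of parts (c), (d), (e), (f), and (i) for small finite cardinals.
for m in range(30):
    for n in range(1, 8):
        assert m == n * Q(m, n) + R(m, n)          # part (c)
        assert 0 <= R(m, n) < n                    # part (d)
        assert Q(m * n, n) == m and R(m * n, n) == 0   # part (e)
for m in range(1, 30):
    assert Q(m, m) == 1                            # part (f)
assert Q(7 * 5, 7) == 5                            # part (i) with m = n * p
```

The checks agree with Python's built-in `divmod` on the same inputs, which is a useful cross-check but plays no role in the definition.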

XI.3.10. Let us define

Σ_{x=m}^{n} f(x)

by weak induction on n, according to the conditions

Σ_{x=m}^{m} f(x) = f(m),

Σ_{x=m}^{n+1} f(x) = f(n + 1) + Σ_{x=m}^{n} f(x).

Prove that, if x = f(x) is stratified, then:

(a) ⊢ (m,n,p):m,n,p ∈ Nn.m ≤ n.n < p. ⊃ .Σ_{x=m}^{p} f(x) = Σ_{x=m}^{n} f(x)
+ Σ_{x=n+1}^{p} f(x).

(b) ⊢ (m,n,p):m,n,p ∈ Nn.m ≤ n. ⊃ .Σ_{x=m}^{n} f(x) = Σ_{x=m+p}^{n+p} f(x − p).

(c) ⊢ (m,n,p)::m,n ∈ Nn.p ∈ NC.m ≤ n:.(x):x ∈ Nn.m ≤ x.x ≤ n. ⊃ .
f(x) ∈ NC: ⊃ :.p × (Σ_{x=m}^{n} f(x)) = Σ_{x=m}^{n} (p × f(x)).

XI.3.11. Let us define n! by weak induction on n, according to the
conditions

0! = 1,

(n + 1)! = (n + 1) × (n!).
Let us define

\binom{n}{m}         as         Q((n!) ÷ ((m!) × ((n − m)!))).
Prove:

(a)  h(n):n6Nn.  D  Q  =  \. 

(b) ⊢ (m,n):m,n ∈ Nn.m ≤ n. ⊃ .(m!) × ((n − m)!) × \binom{n}{m} = n!

(Hint. Use weak induction on n. If (b) is assumed for n and if
m < n, then

((m + 1)!) × ((n − m)!) × (\binom{n}{m} + \binom{n}{m + 1})
= ((m + 1) × (n!)) + ((n − m) × (n!))
= (n + 1)!)

(c) ⊢ (m,n):m,n ∈ Nn.m + 1 ≤ n. ⊃ .\binom{n}{m} + \binom{n}{m + 1} = \binom{n + 1}{m + 1}.

(Hint. Use the same algebraic relation as in the hint for (b).)

(d) ⊢ (x,y,n):x,y ∈ NC.n ∈ Nn. ⊃ .(x + y)^n = Σ_{m=0}^{n} \binom{n}{m} x^m y^(n − m).

(Hint. Use weak induction on n.)

XI.3.12. Prove Thm.XII.2.12. (Hint. Define S by

S({0}) = {⟨m,a⟩},
(n):n ∈ Nn. ⊃ .S({n + 1}) = S({n}) ∪ {⟨m + n + 1,j(m + n,S({n}))⟩}
(see Thm.XI.3.26). Also define

R = ŵ(En).n ∈ Nn.w ∈ S({n}).
Assuming the hypothesis of Thm.XII.2.12, prove the following lemmas:

1. (n):n ∈ Nn. ⊃ .Arg(S({n})) = x̂(x ∈ Nn.m ≤ x.x ≤ m + n).

2. (n):n ∈ Nn. ⊃ .S({n}) ∈ Funct.

3. (n):n ∈ Nn. ⊃ .m(S({n}))a.

4. (n,p):n,p ∈ Nn. ⊃ .(m + n + 1)(S({n + p + 1}))j(m + n,S({n})).

5. (y):mRy. ≡ .y = a.

6. (n):.n ∈ Nn: ⊃ :(y):(m + n + 1)Ry. ≡ .y = j(m + n,S({n})).

7. Arg(R) = α.

8. R ∈ Funct.

9. R(m) = a.

10. (n):.n ∈ Nn: ⊃ :(x):x ∈ α.x ≤ m + n. ⊃ .R(x) = (S({n}))(x).

11. (n):n ∈ α. ⊃ .R(n + 1) = j(n,R).)

XI.3.13. If α, β₁, β₂, . . . are as in the discussion between Thms.IX.5.15
and IX.5.16, show how to define a function R such that

R(0) = α

R(1) = β₁

etc.

With this R, prove ⊢ Clos(α,P) = ⋃{R(n) | n ∈ Nn}. Explain why this
may be interpreted as

Clos(α,P) = α ∪ β₁ ∪ β₂ ∪ · · · .

XI.3.14. Prove ⊢ (α)::α ⊆ Nn.: ⊃ :.(n):n ∈ Nn. ⊃ .(Ex).x ∈ α.x > n: ≡ :
(n):n ∈ Nn. ⊃ .(Ex).x ∈ α.x ≥ n.

XI.3.15. Give an intuitive proof that each real number x with 0 < x ≤ 1
can be uniquely expanded in a nonterminating ternary expansion (every
digit of the expansion is either a 0, 1, or 2) with no digits to the left of the
ternary point.
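
As an aid to intuition for Ex.XI.3.15, the digits of a nonterminating ternary expansion can be generated mechanically for rational x. The Python sketch below assumes 0 < x ≤ 1 and exact rational arithmetic. It always chooses the largest digit strictly below 3x, so that the remainder is never zero and the expansion never terminates. The function name and the use of `fractions.Fraction` are assumptions of the illustration, and the sketch is not offered as the requested proof.

```python
from fractions import Fraction
from math import ceil


def nonterminating_ternary_digits(x, count):
    """First `count` digits of the nonterminating ternary expansion of x, 0 < x <= 1."""
    x = Fraction(x)
    if not 0 < x <= 1:
        raise ValueError("x must satisfy 0 < x <= 1")
    digits = []
    for _ in range(count):
        scaled = 3 * x
        digit = ceil(scaled) - 1   # largest digit strictly below 3x, so 0, 1, or 2
        digits.append(digit)
        x = scaled - digit         # the remainder stays in the range 0 < x <= 1
    return digits


one_third = nonterminating_ternary_digits(Fraction(1, 3), 6)
one_quarter = nonterminating_ternary_digits(Fraction(1, 4), 6)
one = nonterminating_ternary_digits(1, 6)
```

For 1/3 the sketch produces 0 followed by 2's rather than the terminating form 0.1, which is the choice that makes the expansion unique.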

XI.3.16.     Prove: 

(a) ⊢ (λ):⋃λ ∈ Fin. ≡ .λ ∈ Fin.λ ⊆ Fin.
(Hint. Show that ⊢ λ ⊆ SC(⋃λ).)

(b) ⊢ (α)::α ∈ Infin.: ≡ :.(n):n ∈ Nn. ⊃ .(Eβ).β ⊆ α.β ∈ n.

Throughout the remaining exercises, assume the "four" Hausdorff
axioms, and use the notation of Sec. 8 of Chapter IX.

XI.3.17.     Prove: 

(a) (λ):λ ∈ Fin.λ ⊆ OS. ⊃ .⋂λ ∈ OS.

(b) (λ):λ ∈ Fin.λ ⊆ CS. ⊃ .⋃λ ∈ CS.

(Hint. Proceed as in the proof of Thm.XI.3.33.)

XI.3.18. Prove (α,x):∼ x ∈ α.α ∈ Fin. ⊃ .(Eβ).β ∈ H(x).α ∩ β = Λ.
(Hint. Use weak induction on Nc(α).)

XI.3.19. Prove (α,x):.x ∈ α′: ≡ :(β):x ∈ β.β ∈ OS. ⊃ .β ∩ α ∈ Infin.
(Hint. If x ∈ α′, x ∈ β.β ∈ OS, and β ∩ α ∈ Fin, choose γ₁ with γ₁ ∈ H(x).
γ₁ ⊆ β. Then α ∩ (γ₁ − {x}) ∈ Fin. So by the preceding exercise, there is a
γ₂ with γ₂ ∈ H(x).γ₂ ∩ (α ∩ (γ₁ − {x})) = Λ. Choose γ₃ with γ₃ ∈ H(x).
γ₃ ⊆ γ₁ ∩ γ₂. Then x ∈ γ₃.γ₃ ∈ OS.(γ₃ − {x}) ∩ α = Λ, contradicting
x ∈ α′.)

XI.3.20. Prove (α):α ∈ Fin. ⊃ .α′ = Λ.

We say that a set of sets, λ, covers a set, α, if α ⊆ ⋃λ.

We say that a Hausdorff space is compact if, whenever λ is a set of open
sets which covers Σ, then it has a finite subset which covers Σ. This would
be expressed in symbols as follows:

(λ):λ ⊆ OS.⋃λ = Σ. ⊃ .(Eμ).μ ⊆ λ.μ ∈ Fin.⋃μ = Σ.

XI.3.21.  Prove  that  a  Hausdorff  space  is  compact  if  and  only  if  f\\  9^  A 
whenever  X  is  a  set  of  closed  sets  and  Hm  ^  A  for  every  finite  subset  of  X. 
That  is, 

(X):X  C  OS.UA  =   S.   D   .(EAt).M  ^  X.M  e  Fin.ljM  =   2::  =  ::(X)::X  C  CS:. 
(m):m  e  Fin./i  C  X.  D  .flM  ?^  A.:  D  r.fl^  ?^  A. 

XI.3.22. Prove that, if a Hausdorff space is compact, then every infinite
set has a limit point. That is,

(λ):λ ⊆ OS.⋃λ = Σ. ⊃ .(Eμ).μ ⊆ λ.μ ∈ Fin.⋃μ = Σ.: ⊃ :.(α):α ∈ Infin.
⊃ .α′ ≠ Λ.

(Hint. Let the space be compact and assume α ∈ Infin and α′ = Λ. Put
λ = β̂(β ∈ OS.β ∩ α ∈ Fin). By Ex.XI.3.19, ⋃λ = Σ. So μ ⊆ λ.μ ∈ Fin.
⋃μ = Σ. So α ∩ ⋃μ = α. However, α ∩ ⋃μ = ⋃{α ∩ γ | γ ∈ μ}. Since
μ ∈ Fin, we get {α ∩ γ | γ ∈ μ} ∈ Fin. Also, since μ ⊆ β̂(β ∈ OS.β ∩ α ∈ Fin),
we get {α ∩ γ | γ ∈ μ} ⊆ Fin. So α ∈ Fin, and we have a contradiction.)

4.  Denumerable  Classes.  We  shall  define  a  denumerable  class  as  one 
which  can  be  put  into  one-to-one  correspondence  with  the  nonnegative 
integers. That is, we say that α is denumerable if and only if α sm Nn.
We say that α is countable if α is either finite or denumerable. Specifically,
we  define 

Den  for        Nc(Nn) 

Count        for        Fin ∪ Den.

Both  Den  and  Count  are  stratified  and  have  no  free  variables,  and  hence 
may  be  assigned  any  type.  It  will  be  noted  that  Den  is  a  cardinal  number. 
When  thinking  of  it  in  this  sense,  it  is  standard  mathematical  usage  to 
denote it by ℵ₀. As this is often very awkward typographically, we shall
use  the  simpler  Den  in  all  cases. 

From an intuitive point of view, a class α is denumerable if and only if its
members can be arranged in a well-ordered linear sequence with a first
member but no last member, and such that every member but the first has
an immediate predecessor in the sequence. For if they can be so arranged,
we can attach a subscript zero to the first, a subscript unity to the next, a
subscript "two" to the next, etc. We then have set up a one-to-one
correspondence between the members of α and of Nn. Conversely, given such a
pairing of the members of α and Nn, we can arrange the members of α in a
sequence by writing down first the one paired with zero, then the one paired
with unity, then the one paired with "two", etc. In intuitive mathematics,
one often puts the above ideas in evidence by referring to α as consisting of

x₀, x₁, x₂, ....

In  the  symbolic  logic,  we  do  not  have  available  this  idea  of  arranging  the 
members  of  a  set  in  a  linear  sequence.  Instead,  we  have  to  use  the  notion 
of  the  set  being  similar  to  Nn.  This  is  adequate  for  the  purpose,  as  we 
shall  show  by  deriving  the  familiar  theorems  on  denumerable  sets. 

If a set is merely countable, one can still arrange the members in a
sequence, but the sequence may terminate, whereas if the set is denumerable,
the sequence must be nonterminating.

The  usage  which  we  have  adopted  for  "denumerable"  and  "countable"  is 
not universal. Many mathematicians treat "denumerable" and
"countable" as synonymous. Some of these mathematicians use both words in
our  sense  of  "denumerable,"  and  some  use  both  words  in  our  sense  of 
"countable."  However,  our  usage  has  sanction  (see  Kershner  and  Wilcox), 
and  we  know  of  no  case  where  it  is  reversed. 

Some  authors  use  the  word  "enumerable."  Most  commonly  it  is  used 
in  our  sense  of  "denumerable"  (see  Townsend,  1928),  but  occasionally  it  is 
used  in  our  sense  of  "countable." 

Theorem XI.4.1.

I. ⊢ (α):α ∈ Den. ≡ .α sm Nn.
II. ⊢ (α,β):α ∈ Den.α sm β. ⊃ .β ∈ Den.
III. ⊢ (α,β):α,β ∈ Den. ⊃ .α sm β.
IV. ⊢ (α):α ∈ Count. ≡ .α ∈ Fin.∨.α ∈ Den.
V. ⊢ (α,β):α ∈ Count.α sm β. ⊃ .β ∈ Count.

Corollary 1. ⊢ (α):α ∈ Den. ≡ .α sm Nn − {0}.

Corollary 2. ⊢ Λ ∈ Count.

Corollary 3. ⊢ (x).{x} ∈ Count.

Corollary 4. ⊢ (α):α ∈ Den. ⊃ .α ∈ Infin.

Theorem XI.4.2. ⊢ (α)::α ⊆ Nn:.(En):n ∈ Nn.(x).x ∈ α ⊃ x ≤ n.: ⊃ :.
α ∈ Fin.

Proof. Assume α ⊆ Nn and (En):n ∈ Nn.(x).x ∈ α ⊃ x ≤ n. Then
n ∈ Nn and α ⊆ x̂(x ∈ Nn.x ≤ n). Then by Thm.XI.3.43, Cor. 2, and
Thm.XI.3.31, α ∈ Fin.

Corollary. ⊢ (α)::α ⊆ Nn:.(En):n ∈ Nn.(x).x ∈ α ⊃ x < n.: ⊃ :.
α ∈ Fin.

We  now  show  that  any  unbounded  set  of  integers  is  denumerable  and 
indeed  can  be  enumerated  in  the  obvious  fashion,  namely,  by  numbering 
the  members  in  increasing  order,  starting  with  the  least. 
**Theorem XI.4.3. ⊢ (α)::α ⊆ Nn:.(n):n ∈ Nn. ⊃ .(Ex).x ∈ α.x > n.: ⊃ :.
(ER):.Nn sm_R α:.(m,n):.m,n ∈ Nn: ⊃ :m < n. ≡ .R(m) < R(n).

Proof. Assume

(1) α ⊆ Nn

(2) (n):n ∈ Nn. ⊃ .(Ex).x ∈ α.x > n.

Now by Thm.XI.3.24 there is an R such that

(3) R ∈ Funct,

(4) Arg(R) = Nn,

(5) R(0) = ιy (y ∈ α:(z):z ∈ α. ⊃ .y ≤ z),

(6) (n):n ∈ Nn. ⊃ .R(n + 1)
= ιy (y > R(n):.y ∈ α:.(z):z > R(n).z ∈ α. ⊃ .y ≤ z).

Lemma 1. R(0) ∈ α:(z):z ∈ α. ⊃ .R(0) ≤ z.

Proof. By (2), α ≠ Λ. So our lemma follows by (1), (5), and
Thm.XI.3.21, corollary.

Lemma 2. (n)::n ∈ Nn.R(n) ∈ α.: ⊃ :.R(n + 1) > R(n):.R(n + 1) ∈ α:.
(z):z > R(n).z ∈ α. ⊃ .R(n + 1) ≤ z.

Proof. Take β to be ŷ(y > R(n).y ∈ α). Then by (6),

(i) (n):n ∈ Nn. ⊃ .R(n + 1) = ιy (y ∈ β:(z):z ∈ β. ⊃ .y ≤ z).

Now assume n ∈ Nn and R(n) ∈ α. Then R(n) ∈ Nn, so that by (2),
β ≠ Λ. Also, by (1), β ⊆ Nn, so that by Thm.XI.3.21, corollary, and (i)

R(n + 1) ∈ β:(z):z ∈ β. ⊃ .R(n + 1) ≤ z.

By the definition of β, our lemma follows.

Lemma 3. Val(R) ⊆ α.

Proof. Using Lemma 1 and Lemma 2, we can prove by weak induction
on n that

(n):n ∈ Nn. ⊃ .R(n) ∈ α.

Now let y ∈ Val(R). Then nRy. So n ∈ Nn by (4). Also y = R(n) by
Thm.X.5.3. Thus y ∈ α.

Lemma 4. (m,n):m,n ∈ Nn. ⊃ .R(m) < R(m + n + 1).

Proof. We use weak induction on n. By Lemma 2, our lemma holds
when n = 0. Now assume the lemma for n and let n ∈ Nn. Then R(m) <
R(m + n + 1). Also by Lemma 2, R(m + n + 1) < R(m + (n + 1) + 1).
Likewise, by Lemma 3, R(m), R(m + n + 1), and R(m + (n + 1) + 1) are
all in α, and hence all in Nn. So by Thm.XI.2.25, Cor. 4, R(m) < R(m +
(n + 1) + 1).

Lemma 5. R ∈ 1-1.

Proof. Let mRy.nRy. Then by (4), m,n ∈ Nn, and by Thm.X.5.3,
R(m) = y = R(n). By Thm.XI.3.7, m < n.∨.m = n.∨.m > n. If m < n,
then by Lemma 4 and Thm.XI.3.5, R(m) < R(n), which is a contradiction.
If m > n, we get a similar contradiction. So m = n.

Lemma 6. (m):m ∈ Nn. ⊃ .m ≤ R(m).

Proof. Proof by weak induction on m. Clearly 0 ≤ R(0) by Lemma 3.
Let m ≤ R(m). By Lemma 2, R(m) < R(m + 1). So m < R(m + 1),
whence m + 1 ≤ R(m + 1) by Thm.XI.3.5, Cor. 1.

Lemma 7. α ⊆ Val(R).

Proof. Proof by reductio ad absurdum. Assume ~(α ⊆ Val(R)). Then
by duality and rule C, x ∈ α.~ x ∈ Val(R). Let γ = ŷ(y ∈ α.~ y ∈ Val(R)).
Then γ has a least member z, so that by Thm.XI.3.21,

(i) z ∈ α.~ z ∈ Val(R)

(ii) (y):y ∈ α.~ y ∈ Val(R). ⊃ .z ≤ y.

Then by (ii) and Thm.XI.3.7, Cor. 2,

(iii) (y):y ∈ α.y < z. ⊃ .y ∈ Val(R).

Now by Lemma 6, R(z) ≥ z. Hence (m):m ∈ n̂(n ∈ Nn.R(n) < z). ⊃ .
m < z by Lemma 4. So

(iv) n̂(n ∈ Nn.R(n) < z) ∈ Fin

by Thm.XI.4.2. By Lemma 1 and (i), R(0) ≤ z. However, R(0) ∈ Val(R),
so that by (i), R(0) ≠ z. So R(0) < z. Thus 0 ∈ n̂(n ∈ Nn.R(n) < z). So
n̂(n ∈ Nn.R(n) < z) ≠ Λ. Thus, by (iv) and Thm.XI.3.44, Cor. 4, there
is a greatest n in n̂(n ∈ Nn.R(n) < z). Call this w, so that

(v) w ∈ Nn.R(w) < z

(vi) (m):m ∈ Nn.R(m) < z. ⊃ .m ≤ w.

Then by (vi)

(vii) R(w + 1) ≥ z.

By (v), Lemma 2, and Lemma 3, (m):R(w) < m.m ∈ α. ⊃ .R(w + 1) ≤ m.
Then by (v) and (i), R(w + 1) ≤ z. Combining with (vii) gives
R(w + 1) = z. However, R(w + 1) ∈ Val(R). So z ∈ Val(R), and we have a
contradiction by (i). This completes the proof of our lemma.

Lemma 8. (m,n):.m,n ∈ Nn: ⊃ :m < n. ≡ .R(m) < R(n).

Proof. By Lemma 4,

(i) (m,n):m,n ∈ Nn.m < n. ⊃ .R(m) < R(n).

Now let m,n ∈ Nn.R(m) < R(n). If m > n we get a contradiction by (i),
and if m = n we get R(m) = R(n), which is also a contradiction. So m < n
by Thm.XI.3.7.

By the definition of Nn sm_R α, we can derive our theorem from Lemma 5,
(4), Lemma 3, Lemma 7, and Lemma 8.
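
The recursion in (5) and (6) can be tried out informally on a computer. The following is only a sketch, not part of the formal development: the names enumerate_increasing, in_alpha, and is_square are our own, the set α is assumed to be given by a decidable membership test, and the search halts at each step only because α is assumed unbounded, as in hypothesis (2).

```python
import math
from itertools import islice
from typing import Callable, Iterator


def enumerate_increasing(in_alpha: Callable[[int], bool]) -> Iterator[int]:
    """Yield R(0), R(1), R(2), ... for alpha = {x : in_alpha(x)}.

    Each step halts only because alpha is assumed to be unbounded,
    which is hypothesis (2) of the theorem.
    """
    candidate = 0
    while True:
        # Search upward for the least member of alpha not yet passed.
        while not in_alpha(candidate):
            candidate += 1
        yield candidate   # this is R(n) for the current n
        candidate += 1    # later values must exceed R(n), as in (6)


def is_square(x: int) -> bool:
    """Hypothetical example set: the perfect squares."""
    root = math.isqrt(x)
    return root * root == x


first_values = list(islice(enumerate_increasing(is_square), 6))
```

For the perfect squares, the values produced are strictly increasing (Lemma 8) and satisfy n ≤ R(n) (Lemma 6), as one would expect.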

Corollary. ⊢ (α)::α ⊆ Nn:.(n):n ∈ Nn. ⊃ .(Ex).x ∈ α.x > n.: ⊃ :.α ∈ Den.

Theorem XI.4.4. ⊢ (α):α ⊆ Nn. ⊃ .α ∈ Count.

Proof. Assume α ⊆ Nn.

Case 1. (En):n ∈ Nn.(x).x ∈ α ⊃ x ≤ n. Then α ∈ Fin by Thm.XI.4.2.
So α ∈ Count.

Case 2. ~(En):n ∈ Nn.(x).x ∈ α ⊃ x ≤ n. Then by Thm.XI.3.7,
Cor. 2, and duality, (n):n ∈ Nn. ⊃ .(Ex).x ∈ α.x > n. Then α ∈ Den by
Thm.XI.4.3, corollary. So α ∈ Count.

Corollary. ⊢ (α,β):α ∈ Den.β ⊆ α. ⊃ .β ∈ Count.

Proof. Assume α ∈ Den.β ⊆ α. Then Nc(α) = Den. Also Nc(β) ≤
Nc(α). So p ∈ NC.Nc(α) = Nc(β) + p. That is, Den = Nc(β) + p. So
Nn ∈ Nc(β) + p. Then γ ∈ Nc(β).δ ∈ p.γ ∩ δ = Λ.γ ∪ δ = Nn. Then
γ sm β.γ ⊆ Nn. Then γ ∈ Count and so β ∈ Count.

Theorem XI.4.5.
I. ⊢ Den ∈ NC − Nn.
II. ⊢ (n):n ∈ Nn. ⊃ .n < Den.

Proof. These are just Cor. 2 and Cor. 3 of Thm.XI.3.30.

Theorem XI.4.6. ⊢ (n):n ∈ NC.n < Den. ⊃ .n ∈ Nn.

Proof. Assume n ∈ NC and n < Den. Then p ∈ NC.Den = n + p. So
Nn ∈ n + p. Then α ∈ n.β ∈ p.α ∩ β = Λ.α ∪ β = Nn. So α ⊆ Nn. Then
α ∈ Count by Thm.XI.4.4. If α ∈ Den, then n = Den by Thm.XI.2.2,
Part IV, which would contradict n < Den. So ~ α ∈ Den. Hence α ∈ Fin.
Hence Nc(α) ∈ Nn by Thm.XI.3.28, Cor. 1. But n = Nc(α) by Thm.XI.2.2,
Part III.

Corollary 1. ⊢ (α):α ∈ Fin. ≡ .Nc(α) < Den.

Corollary 2. ⊢ (α):α ∈ Count. ≡ .Nc(α) ≤ Den.

Corollary 3. ⊢ (α,β):α ∈ Den.α ⊆ β.β ∈ Count. ⊃ .β ∈ Den.

Proof. From the hypothesis we get Nc(α) = Den, Nc(α) ≤ Nc(β),
Nc(β) ≤ Den. Then by Thm.XI.2.20, Nc(β) = Den, so that β ∈ Den.

Theorem XI.4.7. ⊢ (α):.α ∈ Fin: ≡ :(En).n ∈ Nn.α sm m̂(m ∈ Nn.
m ≠ 0.m < n).

Proof. By Thm.XI.3.43, Cor. 1, and Ex.XI.3.3, we easily go from right
to left. Now assume α ∈ Fin. Then Nc(α) ∈ Nn. Then by Thms.XI.2.44
and XI.2.42, USC(β) ∈ Nc(α). So USC(β) sm α by Thm.XI.2.1. Then
β ∈ Fin by Thm.XI.3.28, Cor. 6, and Thm.XI.3.39. Repeating for β the
reasoning we carried out for α gives us USC(γ) sm β. Then by
Thm.XI.1.33, USC²(γ) sm α. Then by Thm.XI.3.43, α sm x̂(x ∈ Nn.x < Nc(γ)).
We conclude the theorem by use of Ex.XI.3.3.

Corollary. ⊢ (α):.α ∈ Count: ≡ :(Eβ).β ⊆ Nn.α sm β.

Proof. We go from right to left by Thm.XI.4.4. Conversely, let
α ∈ Count. If α ∈ Fin, we use the theorem just proved. If α ∈ Den, take β
to be Nn.

Theorem XI.4.8. ⊢ (R):R ∈ Funct.Arg(R) ⊆ Nn. ⊃ .Val(R) ∈ Count.

Proof. Assume

(1) R ∈ Funct

(2) Arg(R) ⊆ Nn.

Let

(3) S = x̂ŷ(xRy:(z).zRy ⊃ x ≤ z).

Obviously

(4) S ∈ Funct.

Let uSy.vSy. Then by (2), u,v ∈ Nn. Also by (3)

(z).zRy ⊃ u ≤ z

(z).zRy ⊃ v ≤ z.

Taking z to be v in the first of these, and u in the second gives u ≤ v and
v ≤ u, so that u = v. So

(5) S ∈ 1-1.

Obviously

(6) Val(S) ⊆ Val(R).

Now let y ∈ Val(R). Then uRy. Hence x̂(xRy) ≠ Λ. Also by (2),
x̂(xRy) ⊆ Nn. So by Thm.XI.3.21, there is a least x in x̂(xRy), so that
xRy:(z).zRy ⊃ x ≤ z. That is xSy. Thus y ∈ Val(S). So by (6)

(7) Val(S) = Val(R).

By (5), (7), and Thm.XI.1.1, Part IV, Arg(S) sm Val(R). But by (3)
and (2), Arg(S) ⊆ Nn. So by Thm.XI.4.4, Arg(S) ∈ Count. Hence
Val(R) ∈ Count.
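
The passage from R to S in (3) amounts to keeping, for each value, only the least argument that yields it. A rough illustration on a finite fragment of a function follows; it is a sketch under our own assumptions, in which a Python dictionary from arguments to values stands in for R and the name least_preimage is hypothetical.

```python
from typing import Dict, Hashable


def least_preimage(R: Dict[int, Hashable]) -> Dict[int, Hashable]:
    """Return S: the pairs (x, y) of R for which x is the least argument
    with value y, mirroring definition (3) in the proof."""
    least_arg: Dict[Hashable, int] = {}
    for x in sorted(R):          # visit arguments in increasing order
        y = R[x]
        if y not in least_arg:   # first visit means x is the least such argument
            least_arg[y] = x
    return {x: y for y, x in least_arg.items()}


# A hypothetical many-to-one function on a subset of Nn.
R = {0: "a", 1: "b", 2: "a", 3: "c", 4: "b"}
S = least_preimage(R)
```

Here S is one-to-one, has the same values as R, and its arguments form a subset of Nn, which is all that the proof uses.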

Theorem XI.4.9. ⊢ Den × Den = Den.

Proof. By Thm.XI.1.20,

(1) ⊢ (Nn × {0}) ∈ Den.

By Thm.X.3.12, Part II,

(2) ⊢ (Nn × {0}) ⊆ (Nn × Nn).

Let

(3) R = ûv̂((Ex,y).x,y ∈ Nn.u = ((x + y) × (x + y + 1)) + x.
v = ⟨x,y⟩).

Clearly

(4) ⊢ Arg(R) ⊆ Nn.

(5) ⊢ R ∈ Rel.

Let uRv.uRw. Then x,y ∈ Nn.u = ((x + y) × (x + y + 1)) + x.
v = ⟨x,y⟩ and x′,y′ ∈ Nn.u = ((x′ + y′) × (x′ + y′ + 1)) + x′.w = ⟨x′,y′⟩.
Write temporarily z = x + y, z′ = x′ + y′. Then (z × (z + 1)) + x =
u = (z′ × (z′ + 1)) + x′. By Thm.XI.3.7, z < z′.∨.z = z′.∨.z > z′.

Case 1. z < z′. Then by Thm.XI.3.5, n ∈ Nn.z′ = z + n + 1. Then

z′ × (z′ + 1) = (z + n + 1) × ((z + n + 1) + 1)

= (z + (n + 1)) × ((z + 1) + (n + 1))

= (z × (z + 1)) + (z × (n + 1))

+ ((n + 1) × (z + 1)) + ((n + 1) × (n + 1))

= (z × (z + 1)) + (z × (n + n + 2)) + ((n + 2) × (n + 1)).

By Thm.XI.2.35, z × (n + n + 2) ≥ z × 1, so that z × (n + n + 2) ≥ z.
Also

(n + 2) × (n + 1) = (n × n) + (n × 3) + 2,

so that (n + 2) × (n + 1) ≥ 2. Then by Thm.XI.2.25, Cor. 1,

z′ × (z′ + 1) ≥ (z × (z + 1)) + z + 2

> (z × (z + 1)) + z.

Now, since z = x + y, we have z ≥ x by Thm.XI.2.19, corollary. So
z′ × (z′ + 1) > (z × (z + 1)) + x. Then (z′ × (z′ + 1)) + x′ > (z ×
(z + 1)) + x. This contradicts (z × (z + 1)) + x = (z′ × (z′ + 1)) + x′.

Case 2. z > z′. In this case we get a similar contradiction.

So z = z′. Then z × (z + 1) = z′ × (z′ + 1). So from (z × (z + 1)) +
x = (z′ × (z′ + 1)) + x′ we get x = x′ by Thm.XI.3.2. Since z = x + y
and z′ = x′ + y′, we have x + y = x′ + y′, whence y = y′ by Thm.XI.3.2,
since x = x′. Then v = ⟨x,y⟩ = ⟨x′,y′⟩ = w. So

(6) ⊢ R ∈ Funct.

By (4), (6), and Thm.XI.4.8, ⊢ Val(R) ∈ Count. However, obviously
⊢ Val(R) = (Nn × Nn). So

(7) ⊢ (Nn × Nn) ∈ Count.

Now by (1), (2), (7), and Thm.XI.4.6, Cor. 3, we get ⊢ (Nn × Nn) ∈ Den.
Then

⊢ Nc(Nn × Nn) = Den.

So by Thm.XI.2.27

⊢ Den × Den = Den.

Our proof of this is essentially the same as the familiar intuitive proof,
which proceeds by arranging the members of Nn × Nn in a sequence as
follows: ⟨0,0⟩, ⟨0,1⟩, ⟨1,0⟩, ⟨0,2⟩, ⟨1,1⟩, ⟨2,0⟩, ⟨0,3⟩, ⟨1,2⟩, ⟨2,1⟩, ⟨3,0⟩, ⟨0,4⟩, ....
In this sequence, pairs ⟨x,y⟩ are put ahead of pairs ⟨x′,y′⟩ if x + y < x′ + y′.
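
The number ((x + y) × (x + y + 1)) + x attached to ⟨x,y⟩ in (3) can be checked mechanically on a small range. The following sketch is ours and only illustrative: the names code and BOUND are hypothetical, and a finite check of this kind is no substitute for the proof above.

```python
def code(x: int, y: int) -> int:
    """The number attached to the pair <x, y> in definition (3)."""
    z = x + y
    return z * (z + 1) + x


BOUND = 4  # hypothetical cutoff for the finite check
pairs = [(x, y) for x in range(BOUND) for y in range(BOUND)]
codes = [code(x, y) for (x, y) in pairs]

# No two distinct pairs receive the same number on this range.
is_one_to_one = len(set(codes)) == len(pairs)

# Sorting by the attached number reproduces the diagonal arrangement.
in_code_order = sorted(pairs, key=lambda p: code(*p))
```

Note that not every member of Nn is used as a code (for instance, 1 is not), which is why the proof appeals to Thm.XI.4.8 and Thm.XI.4.6, Cor. 3, rather than exhibiting a correspondence with all of Nn directly.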

Theorem XI.4.10. ⊢ (m):m ∈ Nn.m ≠ 0. ⊃ .m × Den = Den.

Proof. Let m ∈ Nn.m ≠ 0. Then n ∈ Nn.m = n + 1. So 1 ≤ m. Then
by Thm.XI.2.35

(1) (1 × Den) ≤ (m × Den).

Also by Thm.XI.4.5, Part II, m < Den. So

(2) (m × Den) ≤ (Den × Den).

However, ⊢ Den = (1 × Den) and ⊢ (Den × Den) = Den. So by
Thm.XI.2.20 and (1) and (2), our theorem follows.

Corollary 1. ⊢ 2 × Den = Den.

Corollary 2. ⊢ Den + Den = Den.

Theorem XI.4.11. ⊢ (m):m ∈ Nn. ⊃ .m + Den = Den.

Proof. We have ⊢ 0 ≤ m < Den. Hence, by Cor. 2 of Thm.XI.4.10,

⊢ Den = 0 + Den ≤ m + Den ≤ Den + Den = Den.

**Theorem XI.4.12. ⊢ Can(Nn).

Proof. Put

(1) R = x̂ẑ(x ∈ Nn:(Eα,β,y).α ∈ x.β ∈ y.y ∈ Nn.z = {y}.α sm USC(β)).

Lemma 1. ⊢ R ∈ 1-1.

Proof. Clearly R ∈ Rel. Now let xRz.xRz′. Then x ∈ Nn, and α ∈ x.
β ∈ y.y ∈ Nn.z = {y}.α sm USC(β), and α′ ∈ x.β′ ∈ y′.y′ ∈ Nn.z′ = {y′}.
α′ sm USC(β′). Then by Thm.XI.2.3, Part I, α sm α′. So USC(β) sm
USC(β′). Then β sm β′ by Thm.XI.1.33. So Nc(β) = Nc(β′) by
Thm.XI.2.2, Part II. Also y = Nc(β) and y′ = Nc(β′) by Thm.XI.2.2, Part III.
Thus y = y′. Then z = {y} = {y′} = z′. Consequently

⊢ R ∈ Funct.

Analogously, we show

⊢ R̆ ∈ Funct.

Lemma 2. ⊢ Arg(R) = Nn.

Proof. It is obvious that ⊢ Arg(R) ⊆ Nn. Let x ∈ Nn. Then α ∈ x by
Thm.XI.2.2, Part VI. Also USC(β) ∈ x by Thm.XI.2.44 and Thm.XI.2.42.
Then α sm USC(β) by Thm.XI.2.3, Part I. Also USC(β) ∈ Fin, since
USC(β) ∈ x.x ∈ Nn. So β ∈ Fin by Thm.XI.3.39. Thus β ∈ y.y ∈ Nn. Hence
xR{y}. So x ∈ Arg(R).

Lemma 3. ⊢ Val(R) = USC(Nn).

Proof. It is obvious that ⊢ Val(R) ⊆ USC(Nn). Let z ∈ USC(Nn).
Then z = {y}.y ∈ Nn. So β ∈ y. Then β ∈ Fin, so that USC(β) ∈ Fin by
Thm.XI.3.38. Accordingly, USC(β) ∈ x.x ∈ Nn. Then xRz, so that
z ∈ Val(R).

Our theorem now follows by (1), (2), (3), Thm.XI.1.1, Part IV, and the
definition of Can(Nn).

Corollary 1. ⊢ (Den)^0 ≠ Λ.

Proof. Use Thm.XI.2.62, corollary.

Corollary 2. ⊢ (Den)^0 ∈ NC.

Proof. Use Thm.XI.2.48, Cor. 1.

Corollary 3. ⊢ (Den)^0 = 1.

Corollary 4. ⊢ 0^Den = 0.

Corollary 5. ⊢ (Den)^1 = Den.

Corollary 6. ⊢ 1^Den = 1.

Corollary 7. ⊢ (Den)^2 = Den × Den.

Corollary 8. ⊢ 2^Den = Nc(SC(Nn)).

Corollary 9. ⊢ Den < 2^Den.

Corollary 10. ⊢ Can(SC(Nn)).

Proof. Use Thm.XI.1.37.

Corollary 11. ⊢ (α):α ∈ Den. ≡ .USC(α) ∈ Den.

Corollary 12. ⊢ (α):α ∈ Count. ≡ .USC(α) ∈ Count.

Corollary 13. ⊢ (α):α ∈ Den. ⊃ .Can(α).

Proof. Use Thm.XI.2.61.

We have remarked earlier that, although we cannot prove Can(α) for
general classes α, it is a property which seems intuitively obvious. Hence
we should be able to prove Can(α) for the familiar classes which appear in
mathematics. We have now done so for the familiar class Nn, and thus for
all denumerable classes.

Theorem XI.4.13. ⊢ (m):m ∈ Nn.m ≠ 0. ⊃ .(Den)^m = Den.

Proof. We prove by weak induction on n that

⊢ (n):n ∈ Nn. ⊃ .(Den)^(n+1) = Den.

If n = 0, we use Thm.XI.4.12, Cor. 5. Assume (Den)^(n+1) = Den. Then
by Thm.XI.4.12, Cor. 2, Thm.XI.3.12, and Thm.XI.2.54,

(Den)^((n+1)+1) = (Den)^(n+1) × (Den)^1
= Den × Den
= Den
by Thm.XI.4.9.

Theorem XI.4.14.
I. ⊢ (α,β):α,β ∈ Den. ⊃ .α ∪ β ∈ Den.
II. ⊢ (α,β):α,β ∈ Count. ⊃ .α ∪ β ∈ Count.

Proof of Part I. Assume α,β ∈ Den. Then Nc(α) = Den. Also
β − α ⊆ β. So by Thm.XI.4.4, corollary, β − α ∈ Count. Then by
Thm.XI.4.6, Cor. 2, Nc(β − α) ≤ Den. Now by Ex.XI.2.8, Nc(α ∪ β) =
Nc(α) + Nc(β − α). But by Thm.XI.2.23,

Nc(α) + Nc(β − α) = Den + Nc(β − α)
≤ Den + Den.

So by Thm.XI.4.10, Cor. 2,

(1) Nc(α ∪ β) ≤ Den.

Now Den = Nc(α) ≤ Nc(α) + Nc(β − α) = Nc(α ∪ β). So

(2) Den ≤ Nc(α ∪ β).

So Nc(α ∪ β) = Den, and α ∪ β ∈ Den.

Proof of Part II. Similar.

A  generalized  form  of  this  says  that  if  one  has  any  nonnull  finite  class  of 
denumerable  classes,  then  the  totality  of  all  their  elements  is  likewise 
denumerable.    Similarly  for  countable  classes. 

Theorem XI.4.15.
I. ⊢ (λ):λ ≠ Λ.λ ∈ Fin.λ ⊆ Den. ⊃ .⋃λ ∈ Den.
II. ⊢ (λ):λ ∈ Fin.λ ⊆ Count. ⊃ .⋃λ ∈ Count.

Proof of Part I. We proceed analogously to the proof of Thm.XI.3.33
to prove ⊢ (λ,n):n ∈ Nn.λ ∈ n + 1.λ ⊆ Den. ⊃ .⋃λ ∈ Den.

Proof of Part II. Similar.

One hears widely quoted an even stronger result to the effect that, if one
has a denumerable class of denumerable classes, then the totality of all
their elements is likewise denumerable. The intuitive proof proceeds as
follows. Let λ be the class, and α₀, α₁, α₂, ... its members. Now let the
members of αₙ be x₀,ₙ, x₁,ₙ, x₂,ₙ, .... By pairing xₘ,ₙ with ⟨m,n⟩, we
establish a one-to-one correspondence between ⋃λ and Nn × Nn. Then by
Thm.XI.4.9, ⋃λ ∈ Den.

Unfortunately,  the  proof  just  given  involves  the  axiom  of  choice.  Indeed, 
we  know  of  no  proof  of  this  result  which  does  not  involve  the  axiom  of 
choice.  Hence  we  must  defer  a  proof  of  this  result  until  Chapter  XIV. 
Meanwhile  we  now  prove  the  nearest  equivalents  that  we  know  how  to 
prove  with  the  axioms  now  at  our  disposal. 

Theorem XI.4.16. Let Rₙ be a term, containing free occurrences of n,
which is stratified and has type one higher than the type of n. Then:
I. ⊢ (n):n ∈ Nn. ⊃ .Rₙ ∈ Funct.Arg(Rₙ) ⊆ Nn.: ⊃ :.⋃{Val(Rₙ) | n ∈ Nn} ∈
Count.
II. ⊢ (n):n ∈ Nn. ⊃ .Rₙ ∈ Funct.Arg(Rₙ) ⊆ Nn:.(En).n ∈ Nn.Val(Rₙ) ∈ Den::
⊃ ::⋃{Val(Rₙ) | n ∈ Nn} ∈ Den.

Proof of I. Assume

(1) (n):n ∈ Nn. ⊃ .Rₙ ∈ Funct.Arg(Rₙ) ⊆ Nn.

By Thm.XI.4.9, there is an S such that

(2) S ∈ 1-1.Arg(S) = Nn.Val(S) = Nn × Nn.

Now define W as

(3) W = x̂ŷ(x ∈ Nn:(Em,n).xS⟨m,n⟩.m ∈ Arg(Rₙ).y = Rₙ(m)).

Clearly

(4) ⊢ Arg(W) ⊆ Nn.

Let xWy.xWz. Then xS⟨m,n⟩.y = Rₙ(m) and xS⟨u,v⟩.z = Rᵥ(u). So
⟨m,n⟩ = ⟨u,v⟩ by (2). So m = u and n = v. So y = z. Thus

(5) W ∈ Funct.

Then by Thm.XI.4.8

(6) Val(W) ∈ Count.

Since Val(S) = Nn × Nn by (2), we readily deduce by (3) that

(7) Val(W) ⊆ ⋃{Val(Rₙ) | n ∈ Nn}.

Conversely, let y ∈ ⋃{Val(Rₙ) | n ∈ Nn}. Then by rule C, y ∈ α.
α ∈ {Val(Rₙ) | n ∈ Nn}. By rule C again, y ∈ α.n ∈ Nn.α = Val(Rₙ). So
n ∈ Nn.y ∈ Val(Rₙ). Thus n ∈ Nn.m ∈ Arg(Rₙ).m(Rₙ)y. Then by (1),
m,n ∈ Nn.m ∈ Arg(Rₙ).y = Rₙ(m). By (2), there is an x such that x ∈ Nn.
xS⟨m,n⟩. Then xWy, and y ∈ Val(W). Thus, by (7),

Val(W) = ⋃{Val(Rₙ) | n ∈ Nn}.

Then our theorem follows by (6).
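
The construction of W in (3) can be pictured as follows. This is only a sketch under assumptions of our own: the proof merely asserts that some S exists by Thm.XI.4.9, whereas here we take S to be the inverse of the usual diagonal pairing; each Rₙ is modeled as a Python dictionary, and the names unpair, family, and multiples are hypothetical.

```python
import math
from typing import Callable, Dict, Optional, Tuple


def unpair(x: int) -> Tuple[int, int]:
    """A one-to-one map of Nn onto Nn x Nn (assumed choice of S):
    the inverse of <m, n> -> (m + n)(m + n + 1)/2 + m."""
    z = (math.isqrt(8 * x + 1) - 1) // 2   # index of the diagonal m + n = z
    m = x - z * (z + 1) // 2               # position along that diagonal
    return m, z - m


def W(x: int, family: Callable[[int], Dict[int, int]]) -> Optional[int]:
    """W(x) = R_n(m) where S pairs x with <m, n>; None if m is not in Arg(R_n)."""
    m, n = unpair(x)
    return family(n).get(m)


def multiples(n: int) -> Dict[int, int]:
    """Hypothetical family: R_n sends m to m * (n + 1) for m = 0, 1, 2."""
    return {m: m * (n + 1) for m in range(3)}


values = {W(x, multiples) for x in range(200)} - {None}
```

A single function W with arguments in Nn thus runs through the values of every Rₙ, which is why Thm.XI.4.8 applies to the whole union at once.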
Proof of II. Assume

(1) (n):n ∈ Nn. ⊃ .Rₙ ∈ Funct.Arg(Rₙ) ⊆ Nn,

(2) (En).n ∈ Nn.Val(Rₙ) ∈ Den.

By (1) and Part I,

(3) ⋃{Val(Rₙ) | n ∈ Nn} ∈ Count.

However, we easily show

(n):n ∈ Nn. ⊃ .Val(Rₙ) ⊆ ⋃{Val(Rₙ) | n ∈ Nn}.

So by (2),

Val(Rₙ) ∈ Den:Val(Rₙ) ⊆ ⋃{Val(Rₙ) | n ∈ Nn}.

Then by (3), our theorem follows from Thm.XI.4.6, Cor. 3.

Theorem XI.4.17. Let Rₙ be a term which is stratified. Then:
I. ⊢ (n):n ∈ Nn. ⊃ .Rₙ ∈ Funct.Arg(Rₙ) ⊆ Nn.: ⊃ :.⋃{Val(Rₙ) | n ∈ Nn} ∈
Count.
II. ⊢ (n):n ∈ Nn. ⊃ .Rₙ ∈ Funct.Arg(Rₙ) ⊆ Nn:.(En).n ∈ Nn.Val(Rₙ) ∈ Den::
⊃ ::⋃{Val(Rₙ) | n ∈ Nn} ∈ Den.

Proof. We shall illustrate the procedure which would be used in case
Rₙ contains free occurrences of n and has type one lower than n. It will
be clear that the proof for any other type difference between Rₙ and n
would be similar, as would the case where Rₙ contains no free occurrences
of n. By Thm.XI.4.12, there are U and W such that

(1) U ∈ 1-1.Arg(U) = Nn.Val(U) = USC(Nn).

(2) W ∈ 1-1.Arg(W) = Nn.Val(W) = USC(Nn).

Define Sₙ so that

(3) Sₙ = x̂ŷ((Eu,v).x(Rᵤ)y.uU{v}.vW{n}).

Then Sₙ satisfies the conditions set on Rₙ in the hypothesis of
Thm.XI.4.16. Also, by (1), (2), and (3) one easily shows that

⋃{Val(Rₙ) | n ∈ Nn} = ⋃{Val(Sₙ) | n ∈ Nn}.

We now prove that the set of all finite subsets of Nn is denumerable.
The usual intuitive proof is to show that there is a single null subset, a
denumerable number of unit subsets, a denumerable number of subsets
with two members, .... By combining all these, we get all finite subsets
as the null set plus the totality of members of a denumerable set of
denumerable sets, which is then denumerable by a well-known theorem. As we
said above, we shall not have this well-known theorem until Chapter XIV,
after assuming the axiom of choice. However, by modifying the reasoning
a bit, we can manage by use of the weaker Thm.XI.4.17.

Theorem XI.4.18. ⊢ (SC(Nn) ∩ Fin) ∈ Den.

Proof. By Thm.XI.4.9 there is an S such that

(1) S ∈ 1-1.Arg(S) = Nn.Val(S) = Nn × Nn.

By Thm.XI.4.12 there is a T such that

(2) T ∈ 1-1.Arg(T) = Nn.Val(T) = USC(Nn).

By Thm.XI.3.24, there is a W such that

(3) W ∈ Funct,

(4) Arg(W) = Nn,

(5) W(0) = T,

(6) (n):n ∈ Nn. ⊃ .W(n + 1) = x̂ŷ(x ∈ Nn:
(Er,s).xS⟨r,s⟩.y = (W(n))(r) ∪ T(s)).

It will turn out that, in effect, W(n) is a function which enumerates all
nonnull subsets of Nn with n + 1 or fewer members.

Lemma 1. (n):n ∈ Nn. ⊃ .W(n) ∈ Funct.Arg(W(n)) = Nn.

Proof. Proof by weak induction on n. Clearly the result holds for
n = 0 by (5). Assume the result for n, and let n ∈ Nn. By (6) and (1),

(i) Arg(W(n + 1)) = Nn.

Now let x(W(n + 1))y.x(W(n + 1))z. Then x ∈ Nn, xS⟨r,s⟩.y =
(W(n))(r) ∪ T(s), and xS⟨r′,s′⟩.z = (W(n))(r′) ∪ T(s′). So by (1),
⟨r,s⟩ = ⟨r′,s′⟩. So r = r′, s = s′. Then y = z. So

(ii) W(n + 1) ∈ Funct.

Lemma 2. Val(W(0)) ∈ Den.

Proof. Use (5), (2), and Thm.XI.4.12.

Lemma 3. 0 ∪ ⋃{Val(W(n)) | n ∈ Nn} ∈ Den.

Proof. Temporarily let α denote ⋃{Val(W(n)) | n ∈ Nn}. By
Thm.XI.4.17, Part II, α ∈ Den.

Case 1. Λ ∈ α. Then by Thm.IX.6.5, Cor. 2, 0 ∪ α ∈ Den.

Case 2. ~ Λ ∈ α. Then by Thm.IX.6.6, Cor. 1, 0 ∪ α ∈ 1 + Den.
So by Thm.XI.4.11, 0 ∪ α ∈ Den.

Lemma 4. (n):n ∈ Nn. ⊃ .Val(W(n)) ⊆ SC(Nn) ∩ Fin.

Proof. Proof by weak induction on n. By (5), (2), and Thm.IX.6.18,
Cor. 1, Val(W(0)) ⊆ SC(Nn). Also by (5), (2), and Thm.XI.3.28, Cor. 4,
Val(W(0)) ⊆ Fin. So by Thm.IX.4.18, Part I, our lemma holds when
n = 0. So assume n ∈ Nn and Val(W(n)) ⊆ SC(Nn) ∩ Fin. If y ∈
Val(W(n + 1)), then by (6), (5), (1), and Lemma 1, r ∈ Arg(W(n)),
s ∈ Arg(W(0)), y = (W(n))(r) ∪ (W(0))(s). So by Lemma 1, (W(n))(r) ∈
Val(W(n)) and (W(0))(s) ∈ Val(W(0)). So by our lemma for 0 and n,
(W(n))(r),(W(0))(s) ∈ SC(Nn) ∩ Fin. Then by Thm.XI.3.32, y ∈
SC(Nn) ∩ Fin.

Lemma 5. 0 ∪ ⋃{Val(W(n)) | n ∈ Nn} ⊆ SC(Nn) ∩ Fin.

Proof. Use Thm.XI.3.28, Cor. 3, to get 0 ⊆ SC(Nn) ∩ Fin. Use
Thm.IX.5.8, Part II, and Lemma 4 to get ⋃{Val(W(n)) | n ∈ Nn} ⊆
SC(Nn) ∩ Fin.

Lemma 6. (α,m):m ∈ Nn.α ∈ m + 1.α ⊆ Nn. ⊃ .(En).n ∈ Nn.α ∈
Val(W(n)).

Proof. Proof by weak induction on m. First let α ∈ 0 + 1.α ⊆ Nn.
Then α ∈ USC(Nn). So by (2) and (5), α ∈ Val(W(0)). So our lemma holds
for m = 0. Now assume the lemma for m and let m ∈ Nn, α ∈ (m + 1) + 1,
α ⊆ Nn. Then β ∈ m + 1.~ x ∈ β.β ∪ {x} = α. Then β ⊆ α, so that
β ⊆ Nn. So by our lemma, n ∈ Nn.β ∈ Val(W(n)). So r ∈ Nn.β = (W(n))(r)
by Lemma 1. Also x ∈ α, so that x ∈ Nn. Then by (2), s ∈ Nn.{x} = T(s).
Finally by (1), z ∈ Nn.zS⟨r,s⟩. So z ∈ Nn.zS⟨r,s⟩.α = (W(n))(r) ∪ T(s).
So z(W(n + 1))α. Thus α ∈ Val(W(n + 1)).

Lemma 7. SC(Nn) ∩ Fin ⊆ 0 ∪ ⋃{Val(W(n)) | n ∈ Nn}.

Proof. Let α ∈ SC(Nn) ∩ Fin. Then α ⊆ Nn and p ∈ Nn.α ∈ p.

Case 1. p = 0. Then α ∈ 0, so that

α ∈ 0 ∪ ⋃{Val(W(n)) | n ∈ Nn}.

Case 2. p ≠ 0. Then m ∈ Nn.p = m + 1. Then by Lemma 6, n ∈ Nn.
α ∈ Val(W(n)). So (Eβ).α ∈ β.β = Val(W(n)).n ∈ Nn. Thus (Eβ).α ∈ β.
β ∈ {Val(W(n)) | n ∈ Nn}. Then α ∈ ⋃{Val(W(n)) | n ∈ Nn}. Thus

α ∈ 0 ∪ ⋃{Val(W(n)) | n ∈ Nn}.

Our theorem now follows by Lemma 3, Lemma 5, and Lemma 7.
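
The recursion (5) and (6) for W can be imitated directly. The sketch below rests on assumptions of our own which the proof does not make: S is taken to be the inverse of the usual diagonal pairing, T(s) is taken to be {s}, and the names unpair, found, and small_subsets are hypothetical.

```python
import math
from itertools import combinations
from typing import FrozenSet, Tuple


def unpair(x: int) -> Tuple[int, int]:
    """Assumed choice of S: inverse of <r, s> -> (r + s)(r + s + 1)/2 + r."""
    z = (math.isqrt(8 * x + 1) - 1) // 2
    r = x - z * (z + 1) // 2
    return r, z - r


def W(n: int, x: int) -> FrozenSet[int]:
    """(W(n))(x): a nonnull subset of Nn with n + 1 or fewer members."""
    if n == 0:
        return frozenset({x})             # clause (5), assuming T(s) = {s}
    r, s = unpair(x)
    return W(n - 1, r) | frozenset({s})   # clause (6)


# Every nonnull subset of {0, 1, 2} should turn up as a value of W(2).
found = {W(2, x) for x in range(200)}
small_subsets = {
    frozenset(c) for k in (1, 2, 3) for c in combinations(range(3), k)
}
```

Since the same member may be adjoined twice, W(n) also repeats the subsets with fewer than n + 1 members, in agreement with the remark following (6).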

EXERCISES 

XI.4.1. Prove ⊢ (R):R ∈ Funct.Arg(R) ∈ Count. ⊃ .Val(R) ∈ Count.

XI.4.2. Formalize the following intuitive proof that the positive rational
numbers are denumerable. In particular, show what has to be done if the
positive rational number (m + 1)/(n + 1) has type one higher than the types
of m and n. (Hint. Use Thm.XI.4.12.)

Proof. Given any ordered pair ⟨m,n⟩ with m,n ∈ Nn, we can obtain a
unique positive rational number, namely, (m + 1)/(n + 1), and every positive
rational number can be so obtained. Thus there is a
function R with Arg(R) = Nn × Nn and Val(R) = the set of positive
rational numbers. Then by Ex.XI.4.1, the positive rational numbers are
countable. However, there is a denumerable subset of the positive rational
numbers.
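
As an informal companion to the intuitive proof (and not the formalization which the exercise asks for), one can list the numbers (m + 1)/(n + 1) diagonal by diagonal and discard repetitions. The function name positive_rationals and its parameter diagonals are our own, and the order of listing is an assumption rather than anything fixed by the text.

```python
from fractions import Fraction
from typing import List


def positive_rationals(diagonals: int) -> List[Fraction]:
    """List (m + 1)/(n + 1) for all pairs with m + n < diagonals,
    in diagonal order, keeping only the first occurrence of each value."""
    seen = set()
    listing: List[Fraction] = []
    for z in range(diagonals):        # z = m + n indexes the diagonal
        for m in range(z + 1):
            n = z - m
            q = Fraction(m + 1, n + 1)
            if q not in seen:         # e.g. 2/2 repeats 1/1 and is skipped
                seen.add(q)
                listing.append(q)
    return listing


listing = positive_rationals(3)
```

The repetitions are the reason the intuitive proof obtains only a function onto the positive rationals, not a one-to-one correspondence, and so appeals to Ex.XI.4.1.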

XI.4.3. Using the fact that the positive rational numbers are
denumerable, prove that all the rational numbers are denumerable.

We  refer  to  a  point  {x,y)  in  the  plane  as  a  rational  point  if  both  its 
coordinates,  x  and  y,  are  rational  numbers. 

XI.4.4.  Prove  that  there  is  a  denumerable  number  of  rational  points 
in  the  plane. 

XI.4.5.  Prove  that  the  set  of  circles  with  a  rational  radius  and  a  rational 
center  is  denumerable. 

XI.4.6. Prove:

(a) ⊢ (α,β):α ⊆ β.β ∈ Count. ⊃ .α ∈ Count.

(b) ⊢ (α,β):α,β ∈ Den. ⊃ .α × β ∈ Den.

(c) ⊢ Nn × USC(Nn) ∈ Den.

XI.4.7. A number is said to be even if it is divisible by 2. That is,
we put

Even    for    m̂((En).n ∈ Nn.m = 2 × n).

Prove ⊢ Even ∈ Den.

XI.4.8. Prove ⊢ m̂((En).n ∈ Nn.m = n^2) ∈ Den.

XI.4.9. Prove ⊢ Den < Nc(SC(Nn) ∩ Infin).

XI.4.10. Prove ⊢ (SC(Nn) ∩ Infin) sm SC(Nn).

XI.4.11. Prove ⊢ 2^Den < Nc(V).

XI.4.12. Prove ⊢ (R):R ∈ Rel.R ∈ Count. ⊃ .(ES).S ∈ Funct.Arg(R) =
Arg(S).S ⊆ R.

XI.4.13. Prove ⊢ (n):.n ∈ NC: ⊃ :Den ≤ n. ≡ .n = Den + n. (Hint.
Use Thm.XI.4.10, Cor. 2.)

XI.4.14. Prove ⊢ (α):.Den ≤ Nc(α): ⊃ :(Eβ).β ⊂ α.β sm α. (Hint.
Use Ex.XI.4.13.)

XI.4.15. Prove ⊢ (α):.(Eβ).β ⊂ α.β sm α: ⊃ :Den ≤ Nc(α). (Hint.
Let α sm_R β.β ⊂ α. Choose x in α − β. Define f by f(0) = x, f(n + 1) =
R(f(n)). Then f ∈ 1-1 and Val(f) ⊆ α.)
XI.4.16. Prove that every open set in the plane is the sum of a countable
number of interiors of circles. (Hint. Prove that the interiors of circles
with rational radii and rational centers constitute a set of neighborhoods
for the plane. Then use Ex.XI.4.5 and Thm.IX.8.11.)

This result is called the Lindelöf theorem.

5. The Cardinal Number of the Continuum. We now consider the
cardinal number of all real numbers. Since the real numbers can be put
into one-to-one correspondence with the points of a line, the cardinal
number of all real numbers is also the cardinal number of all points on a line.
Hence this cardinal number is commonly called the cardinal number of the
continuum.

The cardinal number of the continuum is also the cardinal number of all
positive real numbers less than or equal to unity. In turn, one can set up a
one-to-one correspondence between the real numbers x with 0 < x ≤ 1 and
the nonterminating binary expansions with no digits to the left of the
binary point. The proof of this is sketched on pages 408 to 409 of this
text. Accordingly, one can define the cardinal number of the continuum
as the cardinal number of all nonterminating binary expansions with no
digits to the left of the binary point. We do so.

Each such binary expansion determines a function R with Arg(R) =
Nn − {0} and Val(R) ⊆ {0,1}, since one merely defines R(n) to be the
nth digit in the expansion. Conversely, given a function R with Arg(R) =
Nn − {0} and Val(R) ⊆ {0,1}, we can define a binary expansion by taking
R(n) to be the nth digit. So we identify binary expansions with functions
R such that Arg(R) = Nn − {0} and Val(R) ⊆ {0,1}. Such a binary
expansion is nonterminating if (n):n ∈ Nn. ⊃ .(Em).m ∈ Nn.m > n.
R(m) = 1. Accordingly, we define

PI      for    Nn − {0},

NTBX    for    R̂(R ∈ Funct.Arg(R) = PI.Val(R) ⊆ {0,1}:(n):n ∈ Nn. ⊃ .(Em).m ∈ Nn.m > n.R(m) = 1),

c       for    Nc(NTBX).

Then PI is the class of positive integers, NTBX is the class of
nonterminating binary expansions, and c is the cardinal number of the
continuum. Each of PI, NTBX, and c is stratified, and contains no free variables,
and so may be assigned any type.
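
To make the identification of expansions with functions on PI concrete, here is a small sketch that computes the digits R(1), R(2), ... of the nonterminating binary expansion of a rational x with 0 < x ≤ 1. The function name ntbx_digits is ours, and exact rational arithmetic is assumed so that no rounding enters. The strict comparison in the doubling step is what forces the nonterminating form: 1/2 comes out as 0.0111... rather than 0.1000....

```python
from fractions import Fraction


def ntbx_digits(x, count):
    """Return [R(1), ..., R(count)] for the nonterminating binary
    expansion of x, where 0 < x <= 1.

    Invariant: the remainder stays in the half-open range 0 < x <= 1,
    so a digit 1 always recurs later and the expansion never terminates.
    """
    x = Fraction(x)
    if not (0 < x <= 1):
        raise ValueError("x must satisfy 0 < x <= 1")
    digits = []
    for _ in range(count):
        x *= 2
        if x > 1:          # strict: when x == 1 we emit 0 and keep remainder 1
            digits.append(1)
            x -= 1
        else:
            digits.append(0)
    return digits


print(ntbx_digits(1, 5))               # [1, 1, 1, 1, 1]
print(ntbx_digits(Fraction(1, 2), 5))  # [0, 1, 1, 1, 1]
print(ntbx_digits(Fraction(1, 3), 6))  # [0, 1, 0, 1, 0, 1]
```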

Some writers use ℵ for c.

Theorem XI.5.1.

I. ⊢ (n):n ∈ PI. ≡ .n ∈ Nn.n ≠ 0.
II. ⊢ (n):n ∈ PI. ≡ .(Em).m ∈ Nn.n = m + 1.
III. ⊢ PI ∈ Den.

Proof  of  Part  III.     Use  Thm.XI.3.30. 

Theorem XI.5.2. ⊢ (R)::R ∈ NTBX.: ≡ :.R ∈ Funct.Arg(R) = PI.
Val(R) ⊆ {0,1}:(n):n ∈ Nn. ⊃ .(Em).m ∈ Nn.m > n.R(m) = 1.

Theorem XI.5.3.

I. ⊢ (α):α ∈ c. ≡ .α sm NTBX.
II. ⊢ (α,β):α ∈ c.α sm β. ⊃ .β ∈ c.
III. ⊢ (α,β):α,β ∈ c. ⊃ .α sm β.

Theorem XI.5.4. ⊢ Den ≤ c.

Proof. Put

(1) R_n = x̂ŷ(x ∈ PI:x < n.y = 0.v.x ≥ n.y = 1).

(2) α = {R_n | n ∈ PI}.
Clearly

(3) ⊢ (n):n ∈ PI. ⊃ .R_n ∈ NTBX.

(4) ⊢ α ⊆ NTBX.
Put

(5) S = x̂ŷ((Ez).z ∈ PI.x = {z}.y = R_z).
Then clearly

(6) ⊢ S ∈ Funct.

(7) ⊢ Arg(S) = USC(PI).

(8) ⊢ Val(S) = α.

Let uSy.vSy. Then z ∈ PI.u = {z}.y = R_z and w ∈ PI.v = {w}.y = R_w.
Then R_z = R_w and z < w.v.z = w.v.z > w. If z < w, then (z,0) ∈ R_w but
~(z,0) ∈ R_z by (1), contradicting R_z = R_w. Similarly, we have a
contradiction if z > w. So z = w. So u = v. Thus by (6)

(9) ⊢ S ∈ 1-1.
Then by (9), (7), and (8),

(10) ⊢ USC(PI) sm α.
However, by Thm.XI.3.30 and Thm.XI.1.33,

(11) ⊢ USC(Nn) sm USC(PI).
Also, by Thm.XI.4.12,

(12) ⊢ Nn sm USC(Nn).
So by (10), (11), and (12),

(13) ⊢ Nc(α) = Den.
Also by Thm.XI.2.13, Part I, together with (4),

(14) ⊢ Nc(α) ≤ c.

From (13) and (14), our theorem follows.

Corollary 1. ⊢ c ∈ NC − Nn.

Proof. If c ∈ Nn, we get a contradiction by Thm.XI.4.5, Part II.

Corollary 2. ⊢ c = Den + c.

Proof. By the theorem, p ∈ NC.c = Den + p. Then by Thm.XI.4.10,
Cor. 2, c = (Den + Den) + p = Den + (Den + p) = Den + c.
**Theorem XI.5.5. ⊢ c = 2^Den.

Proof. Put

(1) α = R̂(R ∈ Funct.Arg(R) = PI.Val(R) ⊆ {0,1}:(En):n ∈ Nn:(m):m ∈ Nn.m > n. ⊃ .R(m) = 0).
Clearly

(2) ⊢ α ∩ NTBX = Λ.

(3) ⊢ α ∪ NTBX ⊆ (PI /V {0,1}).

Lemma 1. ⊢ (PI A {0,1}) ⊆ α ∪ NTBX.

Proof. Let R ∈ PI A {0,1}. Then by Thm.XI.1.21, Part II,

(i) R ∈ Funct.Arg(R) = PI.Val(R) ⊆ {0,1}.

Case 1. (n):n ∈ Nn. ⊃ .(Em).m ∈ Nn.m > n.R(m) = 1. Then
R ∈ NTBX.

Case 2. ~(n):n ∈ Nn. ⊃ .(Em).m ∈ Nn.m > n.R(m) = 1. Then by
duality, (En):n ∈ Nn:(m):m ∈ Nn.m > n. ⊃ .R(m) ≠ 1. However,
⊢ n ∈ Nn.m ∈ Nn.m > n. ⊃ .m ∈ PI. So by (i), n ∈ Nn.m ∈ Nn.m > n. ⊃ .
R(m) = 0.v.R(m) = 1. Hence R ∈ α.

Lemma 2. ⊢ Nc(α) + c = 2^Den.

Proof. By (2), (3), Lemma 1, and Thm.XI.2.9, ⊢ Nc(α) + c = Nc(PI
A {0,1}). So by Thm.XI.2.39,

(i) ⊢ Nc(α) + c = Nc(USC(PI)) A Nc(USC({0,1})).

Now by Thm.XI.2.67, Cor. 1, and Thm.XI.2.61, ⊢ Nc(USC({0,1})) =
Nc({0,1}). So

(ii) ⊢ Nc(USC({0,1})) = 2.

By Thm.XI.3.30 and Thm.XI.1.33, ⊢ Nc(USC(PI)) = Nc(USC(Nn)),
and by Thm.XI.4.12, ⊢ Nc(USC(Nn)) = Den. So

⊢ Nc(α) + c = 2^Den.

Lemma 3. ⊢ α sm (SC(Nn) ∩ Fin).
Proof. We put

W = R̂β̂(R ∈ α.β = n̂(n ∈ Nn.R(n + 1) = 1)).

Clearly

(i) ⊢ W ∈ Funct.

(ii) ⊢ Arg(W) = α.

(iii) ⊢ Val(W) ⊆ SC(Nn).

Let RWβ.SWβ. Then R,S ∈ Funct.Arg(R) = PI = Arg(S), Val(R) ⊆
{0,1}, Val(S) ⊆ {0,1}, n̂(n ∈ Nn.R(n + 1) = 1) = β = n̂(n ∈ Nn.S(n + 1) =
1). We wish to prove R = S by means of Thm.X.5.16. So let x ∈ PI.
Then n ∈ Nn.x = n + 1.

Case 1. n ∈ β. Then R(x) = R(n + 1) = 1 = S(n + 1) = S(x).

Case 2. ~ n ∈ β. Then R(x) ≠ 1 and S(x) ≠ 1. But Val(R) ⊆ {0,1}
and Val(S) ⊆ {0,1}. So R(x) = 0 = S(x).

Thus R = S. So by (i),

(iv) ⊢ W ∈ 1-1.

Now let β ∈ Val(W). Then R ∈ α and (n):n ∈ β. ≡ .n ∈ Nn.R(n + 1) = 1.
By R ∈ α, we know that n ∈ Nn:(m):m ∈ Nn.m > n. ⊃ .R(m) = 0. So
(m):m ∈ Nn.m > n. ⊃ .~ m ∈ β. Thus (m):m ∈ β ⊃ m ≤ n. Then by
Thm.XI.4.2, β ∈ Fin. Thus, by (iii),

(v) ⊢ Val(W) ⊆ SC(Nn) ∩ Fin.

Now let β ∈ SC(Nn) ∩ Fin.

Case 1. β = Λ. If we put R = PI↿λx(0), then clearly RWβ, so that
β ∈ Val(W).

Case 2. β ≠ Λ. Then by Thm.XI.3.44, Cor. 4, n ∈ Nn and (m):m ∈ β ⊃
m ≤ n. Now define

R = x̂ŷ((Ez):z ∈ Nn.x = z + 1:z ∈ β.y = 1.v.~ z ∈ β.y = 0).

Clearly R ∈ Funct.Arg(R) = PI.Val(R) ⊆ {0,1}.(m):m ∈ Nn.m > n + 1.
⊃ .R(m) = 0. So R ∈ α. Also, β = n̂(n ∈ Nn.R(n + 1) = 1). So RWβ.
Thus, in this case also β ∈ Val(W).

Thus we have

(vi) ⊢ SC(Nn) ∩ Fin ⊆ Val(W).

Then our lemma follows by (iv), (ii), (v), and (vi).

Now by Lemma 3 and Thm.XI.4.18, ⊢ α ∈ Den. So ⊢ Nc(α) = Den.
So by Lemma 2, ⊢ Den + c = 2^Den. Then by Thm.XI.5.4, Cor. 2, our
theorem follows.
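
The correspondence W of Lemma 3 is easy to experiment with. In the sketch below, which is only an illustration under our own conventions, an eventually-zero expansion R is represented by a finite tuple of its leading digits, with digits[i] standing for R(i + 1) and every later digit taken to be 0. The names W and W_inverse are ours. The second function mirrors Case 1 and Case 2 of the proof that every finite subclass of Nn is a value of W.

```python
def W(digits):
    """Send an eventually-zero expansion to the finite set {n : R(n + 1) = 1}.

    digits[i] represents R(i + 1); all digits beyond the tuple are 0.
    """
    return frozenset(n for n, d in enumerate(digits) if d == 1)


def W_inverse(beta):
    """Build the expansion R with W(R) = beta, in the shortest tuple form.

    Case 1 of the proof: beta empty gives the all-zero expansion.
    Case 2: R(z + 1) = 1 exactly when z is in beta.
    """
    if not beta:
        return ()
    top = max(beta)  # a bound n with m <= n for every m in beta
    return tuple(1 if z in beta else 0 for z in range(top + 1))


print(sorted(W((1, 0, 1))))             # [0, 2]
print(W_inverse(frozenset({0, 2})))     # (1, 0, 1)
```

Tuples that differ only by trailing zeros stand for the same R, so the one-to-one property holds for tuples in shortest form, which is the form W_inverse returns.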

Corollary 1. ⊢ Den < c.

Proof. Use Thm.XI.4.12, Cor. 9.

Corollary 2. ⊢ c = Nc(SC(Nn)).

Proof. Use Thm.XI.4.12, Cor. 8.

Corollary 3. ⊢ c^c ≠ Λ.

Proof. Use Thm.XI.4.12, Cor. 10, and Thm.XI.2.62, corollary.

Corollary 4. ⊢ c^c ∈ NC.

Corollary 5. ⊢ c^0 = 1.

Corollary 6. ⊢ 0^c = 0.

Corollary 7. ⊢ c^1 = c.

Corollary 8. ⊢ 1^c = 1.

Corollary 9. ⊢ c^2 = c × c.

Corollary  10.     [- T  ^  Nc(SC'(Nn)). 

Corollary 11. ⊢ c < 2^c.

Corollary 12. ⊢ (α):α ∈ c. ≡ .USC(α) ∈ c.

Corollary 13. ⊢ (α):α ∈ c. ⊃ .Can(α).

Theorem XI.5.6. ⊢ c = c^Den.

Proof. By the preceding theorem ⊢ c = 2^Den. Then by Thm.XI.4.9,
⊢ c = 2^(Den × Den). Then by Thm.XI.2.56, ⊢ c = (2^Den)^Den. Hence ⊢ c = c^Den.

Corollary 1. ⊢ (m):m ∈ Nn.m ≠ 0. ⊃ .c = c^m.

Proof. If m ∈ Nn.m ≠ 0, then 1 ≤ m ≤ Den. So by Thm.XI.2.52,

c^1 ≤ c^m ≤ c^Den.

Corollary 2. ⊢ (α):α ∈ c. ⊃ .c = Nc(Nn /V α).

Proof. Let α ∈ c. Then by Thm.XI.5.5, Cor. 12, c = Nc(USC(α)).
Also, by Thm.XI.4.12, Cor. 11, ⊢ Den = Nc(USC(Nn)). So c =
Nc(USC(Nn)) A Nc(USC(α)). Then by Thm.XI.2.39, c = Nc(Nn /V α).

Theorem XI.5.7.
I. ⊢ (Den)^Den = c.
II. ⊢ (m):m ∈ Nn.2 ≤ m. ⊃ .m^Den = c.

Proof of Part I. We have ⊢ 2 ≤ Den ≤ c. So by Thm.XI.2.57,
⊢ 2^Den ≤ (Den)^Den ≤ c^Den. But ⊢ c = 2^Den and ⊢ c = c^Den.

Proof of Part II. Similar.

Theorem XI.5.8. ⊢ c = c × c.

Proof. We can take m = 2 in Thm.XI.5.6, Cor. 1. Alternatively, we
can note that ⊢ c = 2^Den = 2^(Den + Den) = 2^Den × 2^Den = c × c.

Corollary 1. ⊢ c = Den × c.

Corollary 2. ⊢ (m):m ∈ Nn.m ≠ 0. ⊃ .c = m × c.

This theorem tells us that there are as many ordered pairs (x,y) of real
numbers x and y as there are real numbers. If we pair the ordered pairs
(x,y) with the points in the plane of which x and y are the coordinates, we
infer that there are exactly as many points in the real plane as in the real
line. Alternatively, we can pair (x,y) with the complex number x + iy and
infer that there are exactly as many complex numbers as real numbers.
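
One familiar way to picture the identity 2^Den × 2^Den = 2^(Den + Den) behind this result is to interleave digits: a pair of binary sequences R and S is merged into a single sequence T with T(2n − 1) = R(n) and T(2n) = S(n). The sketch below is our own illustration on finite prefixes, with interleave and split as assumed names; it is not the route taken in the proof above, which proceeds by cardinal arithmetic. Note that the pairing is between arbitrary functions from PI to {0,1}: splitting a nonterminating expansion can yield a terminating one, so NTBX alone is not closed under split.

```python
def interleave(r, s):
    """Merge equal-length digit prefixes r and s into one prefix t
    with t[2k] = r[k] and t[2k + 1] = s[k]."""
    if len(r) != len(s):
        raise ValueError("prefixes must have equal length")
    t = []
    for a, b in zip(r, s):
        t.extend((a, b))
    return t


def split(t):
    """Recover the two prefixes from an interleaved prefix of even length."""
    return t[0::2], t[1::2]


t = interleave([1, 0, 1], [0, 0, 1])
print(t)          # [1, 0, 0, 0, 1, 1]
print(split(t))   # ([1, 0, 1], [0, 0, 1])
```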

Let us now take α to be the class of all complex numbers in Cor. 2 of
Thm.XI.5.6. Each member of Nn /V α is a function R with Arg(R) = Nn
and Val(R) ⊆ α. That is, R(0), R(1), R(2), . . . constitute a sequence of
complex numbers. That is, Nn /\- α is the class of all sequences of complex
numbers. So there are as many sequences of complex numbers as there are
real numbers.

Let us look at a fixed point z_0. Each function which is analytic in the
neighborhood of z_0 determines a sequence of complex numbers, namely, the
coefficients of the Taylor series expansion of the function at the point. So
the number of functions analytic in the neighborhood of z_0 is no greater
than the number of real numbers. Nor is it less, as one can see by
considering the constant analytic functions whose constant value is a real number.
So there are c functions which are analytic in the neighborhood of z_0. Since
there are c points z_0, there is a temptation to infer that the total number of
analytic functions is less than or equal to c × c, and hence that there are c
analytic functions, since ⊢ c = c × c. However, this inference seems to
require the axiom of choice, and we postpone discussion of it until Chapter
XIV.

Theorem XI.5.9. ⊢ c = c + c.

Proof. Take m = 2 in Cor. 2 of Thm.XI.5.8.

Corollary 1. ⊢ c = Den + c.

Corollary 2. ⊢ (m):m ∈ Nn. ⊃ .c = m + c.

We defined c as the cardinal number of all nonterminating binary
expansions with no digits to the left of the binary point. On pages 408 and 409,
we sketched a proof that these are as numerous as the real numbers x with
0 < x ≤ 1. So there are c such numbers. If, momentarily, we write p for
the cardinal number of the real numbers x with 0 < x < 1, then clearly
p + 1 = c. However, c = c + 1. So p + 1 = c + 1, so that by
Thm.XI.2.12, p = c. If now we pair the real numbers y with 1 < y with the
real numbers x with 0 < x < 1 by writing y = 1/x, we infer that there are
p y's. So the totality of positive real numbers has the cardinal number
p + 1 + p which is p + c or c + c or c. So there are c positive reals. Thus
there are also c negative reals. Counting in zero, we get c + 1 + c reals
altogether, or exactly c real numbers.

Thus  c  is  the  cardinal  number  of  all  real  numbers,  and  hence  c  is  the 
cardinal  number  of  all  points  on  a  line.  Thus  c  is  the  cardinal  number  of  the 
linear  continuum.  Then,  as  noted  above,  c  is  also  the  cardinal  number  of 
the  plane  continuum. 

EXERCISES

XI.5.1. Prove that not every nonterminating binary expansion can be
the expansion of a rational number. (Hint. Use Ex.XI.4.2.)

XI.5.2. Prove that c is the cardinal number of all nonterminating ternary
expansions in which no 1 occurs, and which have no digits to the left
of the ternary point.

Ternary  expansions  of  this  sort  are  just  the  ternary  expansions  of  points 
of  the  famous  Cantor  middle-third  set  (except  the  left  end  point  of  the  set. 
See  Titchmarsh,  1939,  Sec.  10.291,  where  this  set  is  called  the  Cantor 
ternary  set). 

XI.5.3. Prove ⊢ 2^c < Nc(V).

XI.5.4. Prove that c is the cardinal number of the possible positions of
a plane figure in the plane. (Hint. A position is uniquely determined by a
point and an angle.)

XI.5.5. Prove that there are c real functions, f, of real numbers which
are defined only at rational points. (Hint. Let r_0, r_1, r_2, . . . be the rational
numbers. For each f of the kind in question define a sequence R of real
numbers as follows:

R(n) = f(r_n) if r_n ∈ Arg(f) and f(r_n) < 0.
R(n) = f(r_n) + 1 if r_n ∈ Arg(f) and f(r_n) ≥ 0.
R(n) = 1/2 if ~ r_n ∈ Arg(f).)

XI.5.6. Prove that there are c real functions of real numbers which are
continuous in the entire plane. (Hint. Each such function is uniquely
determined by its values at the rational points.)

XI.5.7. Prove that c is the cardinal number of all real functions of real
numbers, f, such that for each such f there is a circle such that f is
continuous and defined exactly on the interior of the circle. (We say that f is
defined on α if Arg(f) = α.)

XI.5.8. Prove that there are c open sets in the plane. (Hint. Let α be
an open set. Enumerate the interiors of circles with rational radii and
rational centers. Call them I_0, I_1, I_2, . . . . By strong induction, define a
function g such that g(0) is the first I_n which lies wholly inside α if α ≠ Λ,
and g(0) = Λ if α = Λ, and g(n + 1) = the first I_n which lies wholly inside
α and has a point in common with α − ⋃{g(m) | m ∈ Nn.m ≤ n} if the
latter set is not null, and g(n + 1) = Λ otherwise. Then Arg(g) − 0 is a
unique subset of the I's.)

XI.5.9. Prove that there are c open sets on the line. (Hint. Each open
set α on the line determines a unique open set in the plane, namely, α × R,
where R is the set of all reals.)

XI.5.10. Prove that c is the cardinal number of all real functions of real
numbers, f, such that for each such f there is an open set α such that f is
continuous and defined exactly on α. (Hint. By the construction of
Ex.XI.5.8, each open set α determines a unique sequence g such that every
value g(n) is either Λ or an I. Then the construction of Ex.XI.5.5
determines a unique sequence of reals for each g(n). Thus for each α there is a
unique sequence of sequences of reals. But by Cor. 2 of Thm.XI.5.6, there
are c sequences of sequences of reals.)

XI.5.11. Prove ⊢ 2^c = c^c.

XI.5.12. Prove that there are 2^c real functions of real numbers.

XI.5.13. Let J be the set of real numbers x with 0 < x < 1 and let
K be the set of rational numbers in J. Prove that there are 2^c subsets of J
which include K.

XI.5.14. Prove that there are 2^c real functions of real numbers which
are continuous at all points at which they are defined. (Hint. For each
subset in Ex.XI.5.13 we can get a continuous function, namely, the
function which is defined to be zero at exactly the points of this subset.)

XI.5.15. Prove that each class of nonoverlapping open sets is a
countable class. (Hint. Each nonempty open set contains a rational number.
Then pair each nonempty set of the class with the first rational which it
contains.)

6. Applications. Many of the theorems of the present chapter are
important theorems in their own right, particularly those on the arithmetic of
finite cardinals. The various principles of proof by induction and definition
by induction are widely used. Besides the illustrations already noted, we
shall give two more instances of definition by induction.

The so-called "sieve of Eratosthenes" (see Hardy and Wright, pages 3
to 4) for discovering the primes is such an instance. We first define S_1 as
Nn − {0,1}. Then we get S_{n+1} from S_n by removing from S_n all multiples
of the least member of S_n. Clearly, this is an inductive definition of S_n.
Then the nth prime, p_n, is just the least member of S_n. For any specified
finite N, one can follow the inductive definition of S_n in order actually to
list all members of S_n ∩ m̂(m ∈ Nn.m ≤ N). One can then read off the
values of p_n below the point where p_n > N. This, with refinements, is the
construction that has been used to construct large tables of prime numbers.
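
The inductive definition can be followed literally on a machine. The sketch below is a direct transcription under our own naming (primes_up_to is not from the text): it keeps only the members of S_n that do not exceed a chosen bound N, reads off the least member as p_n, and forms S_{n+1} by deleting the multiples of p_n. It favors closeness to the definition over speed; the refinements used for large tables are not attempted.

```python
def primes_up_to(N):
    """List the primes not exceeding N by following the inductive
    definition of S_n, restricted to members m <= N."""
    S = set(range(2, N + 1))      # S_1 = Nn - {0, 1}, cut off at N
    primes = []
    while S:                      # S_n has no member <= N once p_n > N
        p = min(S)                # p_n is the least member of S_n
        primes.append(p)
        S = {m for m in S if m % p != 0}   # S_{n+1}: drop multiples of p_n
    return primes


print(primes_up_to(30))   # [2, 3, 5, 7, 11, 13, 17, 19, 23, 29]
```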

In  Ex.XI.5.2  we  gave  an  explicit  definition  of  the  Cantor  middle-third 
set.  Important  properties  of  this  set  are  most  easily  proved  if  the  set  is 
defined  by  a  process  involving  definition  by  induction.  A  typical  intuitive 
definition  of  the  Cantor  middle-third  set  is  as  follows. 

Divide the interval (0,1) into three equal parts, and remove the interior
of the middle part. Next subdivide each of the two remaining parts into
three equal parts, and remove the interiors of the middle parts of each of
them; and repeat this process indefinitely. Let E be the set of points which
remain.

This definition can be formalized as follows. Let S_0 be the set of real
numbers x with 0 ≤ x ≤ 1. Then construct S_{n+1} from S_n by removing the
interiors of the middle thirds of all intervals occurring in S_n. This gives
an inductive definition of S_n. Finally we set E = ⋂{S_n | n ∈ Nn}.
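
The stages S_n can be computed exactly for small n. In the following sketch each S_n is represented as a list of closed intervals with rational end points; the names next_stage, stage, and in_stage are our own, and the representation is an assumption made for illustration. The set E itself, being an intersection over all n, is of course not computed, but membership in any particular S_n can be tested.

```python
from fractions import Fraction


def next_stage(intervals):
    """Form S_{n+1} from S_n: from each closed interval (a, b) remove the
    interior of its middle third, keeping the two outer closed thirds."""
    out = []
    for a, b in intervals:
        third = (b - a) / 3
        out.append((a, a + third))
        out.append((b - third, b))
    return out


def stage(n):
    """Return S_n as a list of closed intervals, starting from S_0 = (0, 1)."""
    S = [(Fraction(0), Fraction(1))]
    for _ in range(n):
        S = next_stage(S)
    return S


def in_stage(x, n):
    """Test whether x belongs to S_n."""
    return any(a <= x <= b for a, b in stage(n))


print(stage(1))                           # the two outer thirds of (0, 1)
print(sum(b - a for a, b in stage(4)))    # total length (2/3)^4 = 16/81
```

Each stage doubles the number of intervals and multiplies the total length by 2/3, so the lengths of the S_n tend to zero even though, by Ex.XI.5.2, E has c members.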

The ideas of finiteness, countability, and noncountability have constant
usage in topology. We have indicated certain of these uses in Ex.XI.3.17
to Ex.XI.3.22, inclusive. By use of Ex.XI.3.19, we can get a simpler proof
of Thm.IX.8.12, as follows. Let x ∈ α′. If now x ∈ β.β ∈ OS, then there is a
y with y ∈ α.y ∈ β. So by Ex.XI.3.19, β ∩ α ∈ Infin. This proof is probably
given more often than the proof which we gave for Thm.IX.8.12.

Actually limit points, x, of a set α, are often classified according to the
cardinality of β ∩ α. In particular, we say that x is a point of condensation
(Verdichtungspunkt) if

(β):x ∈ β.β ∈ OS. ⊃ .~(β ∩ α ∈ Count).

After Ex.XI.3.20, we gave the definition of compactness for a Hausdorff
space (in older writings this is called bicompactness). A set α in a
Hausdorff space is called countably compact if every infinite subset of α has a
limit point in α (in older writings the "countably" is omitted). If α is S
itself, we say that the space is countably compact. In Ex.XI.3.22, we
proved that every compact space is countably compact.

The  so-called  first  and  second  countability  axioms  for  Hausdorff  spaces 
(see  Lefschetz,  1942,  page  6)  impose  the  conditions  of  being  countable  on 
certain  important  sets.  In  particular,  the  second  countability  axiom  says 
that  there  is  a  countable  set  of  neighborhoods  which  is  equivalent  to  the 
given  set  of  neighborhoods  in  the  sense  of  Ex.IX.8.5.  By  use  of  the  axiom 
of  choice  (see  Chapter  XIV),  one  can  show  that,  if  a  Hausdorff  space  is 
countably  compact  and  satisfies  the  second  countability  axiom,  then  it  is 
compact.  By  taking  the  set  of  neighborhoods  in  the  plane  to  be  the  set  of 
circles  with  rational  centers  and  rational  radii  (see  Ex.XI.4.5),  we  can  show 
that  the  plane  is  a  Hausdorff  space  satisfying  the  second  countability  axiom. 

We say that a Hausdorff space is separable if there is a countable set
which has at least one member in common with each open set. In symbols:

(Eα):α ∈ Count:(β):β ∈ OS. ⊃ .α ∩ β ≠ Λ.

By  using  the  axiom  of  choice,  one  can  show  that  a  Hausdorff  space  is 
separable  if  it  satisfies  the  second  countability  axiom.  By  taking  a  to  be 
the  set  of  rational  points  in  the  plane,  we  see  that  the  plane  is  a  separable 
Hausdorff  space. 

There is not uniform usage with regard to the word "separable." Some
writers call a space separable if and only if it satisfies the second
countability axiom. Other writers express this by saying that the space is
completely separable or perfectly separable.

Actually, the ideas of compactness, separability, satisfaction of
countability axioms, etc., are commonly defined in an analogous fashion for rather
more general spaces in which only a weaker form of Hausdorff's fourth
axiom is assumed.

Many  of  these  topological  ideas  carry  over  into  analysis  and  are  reflected 
as  theorems  having  to  do  with  the  cardinality  of  sets.  Thus  the  theorem  of 
Weierstrass,  to  the  effect  that  any  infinite  set  contained  in  a  closed  interval 
of  the  line  has  a  limit  point  in  the  interval,  is  merely  a  statement  that  any 
such  closed  interval  is  a  countably  compact  Hausdorff  space.  Incidentally, 
the  usual  proof  given  for  Weierstrass's  theorem  (see  Hardy,  1947,  page  32) 
is  an  interesting  example  of  the  use  of  the  ideas  of  finite  and  infinite  sets. 

Another  theorem  of  analysis  which  reflects  a  topological  idea  is  the 
Heine-Borel  theorem,  to  the  effect  that,  if  a  bounded  closed  set  on  the  line 
is  covered  by  a  set  X  of  open  sets,  then  it  is  covered  by  a  finite  subset  of  X. 
This  says  in  effect  that  any  nonempty  bounded  closed  set  is  a  compact 
Hausdorff  space. 

The proof of the Heine-Borel theorem is sufficiently illustrative of the
ideas of this chapter that we now sketch it in some detail. Let us have the
bounded closed set α, covered by the set X of open sets. That is, X ⊆ OS.
α ⊆ ⋃X. We dismiss the trivial case where α = Λ, and take a and b to be,
respectively, the greatest lower bound and least upper bound of α. By the
boundedness of α, a and b exist, and by the closure of α, a and b are in α.
Moreover, a and b bound α so that (x):x ∈ α. ⊃ .a ≤ x ≤ b.

Now define β as the set of all points x of α such that some finite subset
of X covers all points of α in the closed interval (a,x). That is,

β = x̂(x ∈ α::(Eμ):.μ ⊆ X.μ ∈ Fin:(y):y ∈ α.a ≤ y ≤ x. ⊃ .y ∈ ⋃μ).

Clearly a is in β. We define L as the set of all numbers which are less than
or equal to some member of β and R as the set of all numbers which are
greater than all members of β. Clearly a is in L, and every number greater
than b is in R. So by the Dedekind theorem (see Hardy, 1947, page 301),
L and R constitute a Dedekind cut which determines a number z in the
interval (a,b) such that

(i) (ε):ε > 0. ⊃ .(Ey).y ∈ β.z − ε < y ≤ z,

(ii) (y):y > z. ⊃ .~ y ∈ β.

By (i), z is a limit point of β, and hence of α, since β ⊆ α. But α is
closed, so that z ∈ α. Then z is in some member γ of X, and since γ is an
open set, some neighborhood ŵ(z − ε < w < z + ε) of z is included in γ.
That is,

(iii) (w):z − ε < w < z + ε. ⊃ .w ∈ γ.

We first show that z is in β. By (i), there is a y with y ∈ β.z − ε < y ≤ z.
Then by the definition of β, some finite subset μ of X covers that part of α
in the interval (a,y). By (iii), γ covers that part of α in the interval (y,z),
except perhaps the point y. So, together, μ and γ cover that part of α in
the entire interval (a,z). That is, the finite set μ ∪ {γ} covers all points of
α in the closed interval (a,z). This, with z ∈ α and the definition of β, gives
z ∈ β. Furthermore, z must be b, for if not, one can get a contradiction as
follows.

Case 1. Let there be a w in α with z < w < z + ε and w ≤ b. Then by
(iii), γ covers the interval (z,w). So the finite set μ ∪ {γ} covers the points
of α in the interval (a,w). Then w is in β, contradicting (ii).

Case 2. Let there be no such w. Then z + ε ≤ b. Take v to be the
greatest lower bound of the closed set consisting of those members of α in
the closed interval (z + ε,b). Then there are no members of α between z
and v. Also v is in some δ of X. Then μ ∪ {γ} ∪ {δ} covers all points of α
in the closed interval (a,v). Then v is in β, contradicting (ii).

Since z is b, we have those points of α in the interval (a,b) covered by the
finite set μ ∪ {γ}, and our theorem is proved.
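
The way the point z advances from a toward b in this proof can be imitated in a very special case. The sketch below assumes that α is the whole closed interval (a,b) and that the cover is handed to us as a finite list of open intervals, each given by its end points. In that case the theorem itself is trivial, so the code is only meant to illustrate the sweep, not to prove anything. The name finite_subcover and the greedy choice of the interval reaching farthest right are our own.

```python
def finite_subcover(a, b, cover):
    """Pick members of `cover` (open intervals (lo, hi)) that together
    cover the closed interval from a to b, sweeping left to right.

    Returns the chosen intervals in order, or None if some point of the
    closed interval lies in no member of `cover`.
    """
    chosen = []
    z = a                                   # everything left of z is covered
    while True:
        candidates = [(lo, hi) for lo, hi in cover if lo < z < hi]
        if not candidates:
            return None                     # z itself is not covered
        best = max(candidates, key=lambda iv: iv[1])
        chosen.append(best)
        if best[1] > b:
            return chosen                   # the sweep has passed b
        z = best[1]                         # this end point is not in `best`


cover = [(-1, 0.4), (0.3, 0.8), (0.2, 0.5), (0.7, 1.5), (0.35, 0.6)]
print(finite_subcover(0, 1, cover))   # [(-1, 0.4), (0.3, 0.8), (0.7, 1.5)]
```

Each pass strictly increases z, so with a finite list the loop must stop; in the theorem proper it is the Dedekind cut that plays this role.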

There  occur  from  time  to  time  other  cases  in  analysis  in  which  there  is 
interest  in  whether  a  set  is  finite  or  infinite,  such  as  Picard's  theorems 
(Titchmarsh,  1939,  pages  282  to  283),  or  in  which  this  point  is  relevant  in  a 
proof  (see  Lemma  1  on  page  356  of  Titchmarsh,  1939). 

A  rather  startling  result  concerning  the  use  of  finite  and  infinite  is  given 
by  Visser  (see  Visser,  1937).  The  basic  theorem  used  by  Visser  is  a  special 
case  of  a  result  given  earlier  by  Ramsey  (see  Ramsey,  1930). 

We need hardly warn the reader that the use of ∞ in such places as

lim_{y→0} ( . . . ),    ∫ . . . dy = 1,

etc., has nothing to do with the notion of infinite set but is concerned with a
limiting process involving a number which is unbounded.

In the theory of Lebesgue measure and Lebesgue integration (see
Titchmarsh, 1939, Chapter X to Chapter XII, inclusive), the notion of
denumerable classes plays a central role. Indeed, the exterior measure of a set E is
defined by essentially the following device. Choose denumerable sets, X,
of open intervals such that no two members of X overlap and X covers E.
One can easily attach a number, m(X), to each such X which is properly
described as the sum of the lengths of the members of X. Then the exterior
measure of E is the greatest lower bound of the m(X) for all X of the sort in
question.

Finally, in the theory of general point sets on the line, or in the plane,
theorems dealing with cardinality are numerous. For example, if α is a
set of real numbers and α = α′, then Nc(α) = c (see Townsend, 1928,
page 48).

EXERCISE 
XI.6.1.     Prove  the  Heine-Borel  theorem  in  the  plane. 


CHAPTER  XII 
ORDINAL  NUMBERS 

1. Ordinal Similarity. When two classes are similar, they have the same
number of elements, and conversely. With ordered sets, the interest is
usually in whether they have the same kind of order; that is, whether they
are ordinally similar. To be ordinally similar, they must not merely be
similar. One must be able to make their elements correspond in such a way
that the order relation between elements in one set is "preserved" by the
correspondence when we pass over to the other set. To be painfully explicit,
if a precedes b in one set, then the element corresponding to a must precede
the element corresponding to b in the second set, and vice versa.

As  we  consistently  deal  with  the  ordering  relations  rather  than  with  the 
ordered  sets,  we  define  ordinal  similarity  between  the  ordering  relations 
rather  than  between  the  ordered  sets.  Ordinal  similarity  between  ordering 
relations  necessarily  entails  ordinal  similarity  between  the  ordered  sets 
determined  by  the  ordering  relations.    So  we  define 

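$$P\ \mathrm{smor}_R\ Q \quad\text{for}\quad P,Q \in \mathrm{Rel}: \mathrm{AV}(P)\ \mathrm{sm}_R\ \mathrm{AV}(Q): (x,y).\ xPy \supset (R(x))Q(R(y)): (x,y).\ xQy \supset (\breve{R}(x))P(\breve{R}(y)).$$

$$\mathrm{smor} \quad\text{for}\quad \hat{P}\hat{Q}\,(\mathrm{E}R).\ P\ \mathrm{smor}_R\ Q.$$

For stratification of $P\ \mathrm{smor}_R\ Q$ it is necessary and sufficient that $P = Q = R$ be stratified. The term smor is stratified and contains no free
variables and so may be assigned any type.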

Note  that  P  and  Q  need  not  be  ordering  relations  to  be  ordinally  similar. 
However,  ordinal  similarity  is  of  interest  mainly  between  ordering  relations. 
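
For finite relations the definition of ordinal similarity can be checked mechanically. The following is a minimal sketch in Python, under the assumption that a relation is represented as a set of ordered pairs and the correspondence R as a dictionary; the names `field`, `smor_via`, and `smor` are introduced here for illustration only and do not occur in the text.

```python
from itertools import permutations


def field(rel):
    """AV(P): every element that occurs as an argument or a value of rel."""
    return {x for pair in rel for x in pair}


def smor_via(p, q, r):
    """Check P smor_R Q, where the dict r plays the role of R."""
    av_p, av_q = field(p), field(q)
    images = set(r.values())
    # AV(P) sm_R AV(Q): r must map AV(P) one-to-one onto AV(Q).
    if set(r) != av_p or images != av_q or len(images) != len(r):
        return False
    r_inv = {v: k for k, v in r.items()}  # the converse of R
    forward = all((r[x], r[y]) in q for (x, y) in p)
    backward = all((r_inv[x], r_inv[y]) in p for (x, y) in q)
    return forward and backward


def smor(p, q):
    """Check P smor Q by trying every one-to-one map of AV(P) onto AV(Q)."""
    av_p, av_q = list(field(p)), list(field(q))
    if len(av_p) != len(av_q):
        return False
    return any(smor_via(p, q, dict(zip(av_p, image)))
               for image in permutations(av_q))


# Two three-element chains, and the identity relation on three elements.
leq3 = {(a, b) for a in range(3) for b in range(3) if a <= b}
abc = {("a", "a"), ("a", "b"), ("a", "c"), ("b", "b"), ("b", "c"), ("c", "c")}
identity3 = {(a, a) for a in range(3)}
```

Here `smor(leq3, abc)` holds, while `smor(identity3, leq3)` fails: the fields are similar, but no correspondence preserves the order in both directions. The brute-force search over all correspondences is practical only for very small fields.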

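Theorem XII.1.1.
I. $\vdash \exists(\mathrm{smor})$.

II. $\vdash \mathrm{smor} \in \mathrm{Rel}$.

III. $\vdash (P,Q): P\ \mathrm{smor}\ Q.\ \equiv.\ (\mathrm{E}R).\ P\ \mathrm{smor}_R\ Q$.

*IV. $\vdash (P,Q):: P\ \mathrm{smor}\ Q.: \equiv :.(\mathrm{E}R):. P,Q \in \mathrm{Rel}: \mathrm{AV}(P)\ \mathrm{sm}_R\ \mathrm{AV}(Q): (x,y).\ xPy \supset (R(x))Q(R(y)): (x,y).\ xQy \supset (\breve{R}(x))P(\breve{R}(y))$.
Theorem XII.1.2. $\vdash (P): P \in \mathrm{Rel}.\ \supset.\ P\ \mathrm{smor}\ P$.
Proof. Take $R$ to be $\mathrm{AV}(P) \upharpoonleft I$ in Thm.XII.1.1, Part IV.
Corollary 1. $\vdash \mathrm{Arg}(\mathrm{smor}) = \mathrm{Val}(\mathrm{smor}) = \mathrm{AV}(\mathrm{smor}) = \mathrm{Rel}$.
Corollary 2. $\vdash \mathrm{smor} \in \mathrm{Ref}$.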

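Theorem XII.1.3. $\vdash (P,Q,R,S): S = \breve{R}.\ P\ \mathrm{smor}_R\ Q.\ \supset.\ Q\ \mathrm{smor}_S\ P$.
**Corollary 1. $\vdash (P,Q): P\ \mathrm{smor}\ Q.\ \supset.\ Q\ \mathrm{smor}\ P$.

Corollary 2. $\vdash \mathrm{smor} \in \mathrm{Sym}$.

Theorem XII.1.4. $\vdash (P_1,P_2,P_3,R,S,T): T = R|S.\ P_1\ \mathrm{smor}_R\ P_2.\ P_2\ \mathrm{smor}_S\ P_3.\ \supset.\ P_1\ \mathrm{smor}_T\ P_3$.
**Corollary 1. $\vdash (P,Q,R): P\ \mathrm{smor}\ Q.\ Q\ \mathrm{smor}\ R.\ \supset.\ P\ \mathrm{smor}\ R$.

Corollary 2. $\vdash \mathrm{smor} \in \mathrm{Trans}$.

Corollary 3. $\vdash \mathrm{smor} \in \mathrm{Equiv}$.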

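Theorem XII.1.5. $\vdash (P,Q):. P,Q \in \mathrm{Rel}: \supset :P\ \mathrm{smor}\ Q.\ \equiv.\ \mathrm{RUSC}(P)\ \mathrm{smor}\ \mathrm{RUSC}(Q)$.

Proof. If $P\ \mathrm{smor}_R\ Q$ and $S = \mathrm{RUSC}(R)$, then $\mathrm{RUSC}(P)\ \mathrm{smor}_S\ \mathrm{RUSC}(Q)$. Conversely, let $P,Q \in \mathrm{Rel}$ and $\mathrm{RUSC}(P)\ \mathrm{smor}_S\ \mathrm{RUSC}(Q)$.
Take

$$R = \hat{x}\hat{y}(\{x\}S\{y\}).$$

Then $P\ \mathrm{smor}_R\ Q$.

Theorem XII.1.6. $\vdash (P,Q): P\ \mathrm{smor}\ Q.\ P \in \mathrm{Ref}.\ \supset.\ Q \in \mathrm{Ref}$.

Proof. Let $P\ \mathrm{smor}_R\ Q$ and $P \in \mathrm{Ref}$. Then by Thm.XII.1.1, Part IV, and Thm.XI.1.1, Part IV, $P,Q \in \mathrm{Rel}$, $R \in 1\text{-}1$, $\mathrm{AV}(P) = \mathrm{Arg}(R)$, $\mathrm{AV}(Q) = \mathrm{Val}(R)$, and $(x,y).\ xPy \supset (R(x))Q(R(y))$. Now let $x \in \mathrm{AV}(Q)$. Then $x \in \mathrm{Val}(R)$. So by Thm.X.5.3, Cor. 5, $\breve{R}(x) \in \mathrm{Arg}(R)$. Then $\breve{R}(x) \in \mathrm{AV}(P)$, so that $(\breve{R}(x))P(\breve{R}(x))$. Then $(R(\breve{R}(x)))Q(R(\breve{R}(x)))$. Then by Thm.X.5.20, Part II, $xQx$.

Theorem XII.1.7. $\vdash (P,Q): P\ \mathrm{smor}\ Q.\ P \in \mathrm{Trans}.\ \supset.\ Q \in \mathrm{Trans}$.

Proof. As in the proof of Thm.XII.1.6, let $P\ \mathrm{smor}_R\ Q$ and $P \in \mathrm{Trans}$. If now $xQy.\ yQz$, then $(\breve{R}(x))P(\breve{R}(y)).\ (\breve{R}(y))P(\breve{R}(z))$. So $(\breve{R}(x))P(\breve{R}(z))$. Then $(R(\breve{R}(x)))Q(R(\breve{R}(z)))$. Then $xQz$.

Corollary. $\vdash (P,Q): P\ \mathrm{smor}\ Q.\ P \in \mathrm{Qord}.\ \supset.\ Q \in \mathrm{Qord}$.

Theorem XII.1.8. $\vdash (P,Q): P\ \mathrm{smor}\ Q.\ P \in \mathrm{Antisym}.\ \supset.\ Q \in \mathrm{Antisym}$.

Corollary. $\vdash (P,Q): P\ \mathrm{smor}\ Q.\ P \in \mathrm{Pord}.\ \supset.\ Q \in \mathrm{Pord}$.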

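Theorem XII.1.9.
I. $\vdash (P,Q,R):. P\ \mathrm{smor}_R\ Q: \supset :(x,\beta).\ x\ \mathrm{least}_P\ \beta \supset R(x)\ \mathrm{least}_Q\ R''\beta: (x,\beta).\ x\ \mathrm{least}_Q\ \beta \supset \breve{R}(x)\ \mathrm{least}_P\ \breve{R}''\beta$.
II. $\vdash (P,Q,R):. P\ \mathrm{smor}_R\ Q: \supset :(x,\beta).\ x\ \mathrm{min}_P\ \beta \supset R(x)\ \mathrm{min}_Q\ R''\beta: (x,\beta).\ x\ \mathrm{min}_Q\ \beta \supset \breve{R}(x)\ \mathrm{min}_P\ \breve{R}''\beta$.

Proof of Part I. Assume $P\ \mathrm{smor}_R\ Q$ and $x\ \mathrm{least}_P\ \beta$. Then $x \in \mathrm{AV}(P)$. Then $R(x) \in (R''\beta) \cap \mathrm{AV}(Q)$ by Thm.X.5.15 and Thm.X.5.3, Cor. 4. Now let $z \in (R''\beta) \cap \mathrm{AV}(Q)$. Then $\breve{R}(z) \in (\breve{R}''R''\beta) \cap \mathrm{AV}(P)$. But, by Thm.X.5.22, Cor. 1, $\breve{R}''R''\beta = \beta \cap \mathrm{AV}(P)$. So $\breve{R}(z) \in \beta \cap \mathrm{AV}(P)$. Then, by the definition of $x\ \mathrm{least}_P\ \beta$, $xP(\breve{R}(z))$. So $(R(x))Q(R(\breve{R}(z)))$. Then $(R(x))Qz$.

Thus we have proved $(x,\beta).\ x\ \mathrm{least}_P\ \beta \supset R(x)\ \mathrm{least}_Q\ R''\beta$. The proof of $(x,\beta).\ x\ \mathrm{least}_Q\ \beta \supset \breve{R}(x)\ \mathrm{least}_P\ \breve{R}''\beta$ proceeds similarly.
Proof of Part II. Similar.
Theorem XII.1.10. $\vdash (P,Q): P\ \mathrm{smor}\ Q.\ P \in \mathrm{Connex}.\ \supset.\ Q \in \mathrm{Connex}$.

Corollary. $\vdash (P,Q): P\ \mathrm{smor}\ Q.\ P \in \mathrm{Sord}.\ \supset.\ Q \in \mathrm{Sord}$.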

Theorem  XII.1.11.     \-  (P,Q,R):P  smor^  Q.  D  .P  smor«  Q. 

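Theorem XII.1.12. $\vdash (P,Q,R,S,\alpha): P\ \mathrm{smor}_R\ Q.\ S = \alpha \upharpoonleft R.\ \supset.\ (\alpha \upharpoonleft P \upharpoonright \alpha)\ \mathrm{smor}_S\ ((R''\alpha) \upharpoonleft Q \upharpoonright (R''\alpha))$.

Theorem XII.1.13. $\vdash (P,Q,R,x): P\ \mathrm{smor}_R\ Q.\ x \in \mathrm{AV}(P).\ \supset.\ R''(\hat{z}(z(P - I)x)) = \hat{z}(z(Q - I)(R(x)))$.

**Corollary 1. $\vdash (P,Q,R,S,x,y): P\ \mathrm{smor}_R\ Q.\ x \in \mathrm{AV}(P).\ S = (\hat{z}(z(P - I)x)) \upharpoonleft R.\ y = R(x).\ \supset.\ (\mathrm{seg}_x P)\ \mathrm{smor}_S\ (\mathrm{seg}_y Q)$.

Corollary 2. $\vdash (P,Q,x): P\ \mathrm{smor}\ Q.\ x \in \mathrm{AV}(P).\ \supset.\ (\mathrm{E}y).\ y \in \mathrm{AV}(Q).\ (\mathrm{seg}_x P)\ \mathrm{smor}\ (\mathrm{seg}_y Q)$.
**Theorem XII.1.14. $\vdash (P,Q): P\ \mathrm{smor}\ Q.\ P \in \mathrm{Word}.\ \supset.\ Q \in \mathrm{Word}$.

Theorems XII.1.5 to XII.1.14 inclusive seem to establish the fact that
ordinal similarity between two relations is possible only in case they have
similar order properties.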

EXERCISES 
XII.1.1.     Prove: 

(a)  \-  (P,Q):P  smor  Q.P  e  Sym.  D  .Q  e  Sym. 

(b)  \-  (P,Q):P  smor  Q.P  e  Equiv.  D  .Q  e  Equiv. 

XII.1.2.     Prove  \-  {P):P  smor  A.  =  .P  =  A. 
XII.1.3.     Prove: 

(a)  \-  (x,y).{(x,x)}  smor  {(y,y)\. 

(b)  h  (:c,P):P  smor  {(x,x)}.  D  .(Ey).P  =  {{y,y)}. 

XII.1.4.     Prove  \-  {P,x):P  e  Rel.  D   .P  smor  {yz(Eu,v).uPv.y  =  <.t,m). 

2   =   {X,V)). 

XII.1.5.     Let  P  +,  Q  stand  for 

xy(xPy.y.x  e  AV(P).y  e  A'V{Q).y.xQy). 
Prove : 

(a)  h  (P,Q):P,Q  e  Ref.  D  .P  +.  Q  e  Ref. 

(b)  \-  (P,Q):AV(P)  n  AV(Q)  =  A.P.Q  e  Trans.  D  .P  +,  Q  e  Trans. 

(c)  \-  (P,Q):AV(P)  n  AV(Q)  =  A.P,Q  e  Antisym.  D  .P  +,  Q  6  Antisym. 

(d)  [-  (P,Q):P,Q  e  Connex.  D  .P  +.  Q  e  Connex. 

(e)  h  (P,Q):AV(P)  n  AV(Q)  =  A.P,Q  e  Word.  D  .P  +,  Q  e  Word. 

(f)  h  (P,Q,R,S):P  smor  Q.P  smor  >S.AV(P)  n  AV(P)  =  A.AV(Q)  r\ 

AY(S)  =  A.  D  .(P  +.  R)  smor  (Q  +,  aS). 

(g)  h  (P,Q):AV(P)  n  AV(Q)  =  A.a:  leasto  AV(Q).  D  .P  =  seg.(P  +.  Q). 
(h)    h  (P,Q,R).(iP  +.  Q)  +.  ^)  =  (P  +.  (Q  +.  R))- 

(i)      h  (P).P  +.  A  =  P  =  A  +.  P.
XII.1.6.     Let  P  X.  Q  stand  for

£y(Eu,v,w,z):U,'w  e  AV(P):v,z  e  AV(Q): 
X  =  {u,v):y  =  {w,z):v(Q  —  I)z.v.v  =  z.uPw. 

Prove : 

(a)  h  (P,Q):P,Q  e  Word.  D  .P  X,  Q  e  Word. 

(b)  h  {P,Q,R):(iP  X.  Q)  Xs  R)  smor  (P  X,  (Q  X.  P)). 

(c)  h  (P,Q,Ry.P  X,  (Q  +,  P)  =  (P  X,  Q)  +,  (P  X,  P). 

(d)  h  (P).P  X.  A  =  A  =  A  X.  P. 

(e)  \-  (P,x):P  e  Rel.  D  .P  smor  ({<x,a;>}  X,  P). 

(f)  h  (P,a:):P  €  Rel.  D  .P  smor  (P  X,  Ka:,x)}). 

2.  Well-ordering  Relations.  We  shall  define  ordinal  numbers  as  equiva- 
lence classes  of  well-ordering  relations  with  respect  to  smor.  Before  doing 
so,  we  wish  to  establish  some  special  properties  of  well-ordering  relations. 

It  turns  out  that  there  is  a  unique  basic  structure  for  all  well-ordered 
sets,  and  any  given  well-ordered  set  is  in  effect  just  an  initial  segment  of 
this  basic  structure.  To  put  it  another  way,  if  one  starts  with  a  first  element 
(and  every  nonempty  well-ordered  set  must  have  a  first  element,  by  the 
descending-chain  condition)  and  builds  up  a  well-ordered  set  by  adding 
further  elements  (and  there  is  always  a  unique  "next"  element  if  one  has 
not  finished,  again  by  the  descending-chain  condition),  one  must  follow  a 
fixed  pattern.  If  one  takes  a  "few"  elements,  one  will  not  go  far  along  the 
basic  structure.  If  one  takes  "many"  elements,  one  will  go  far  along  the 
basic  structure.  Thus  different  sets  may  extend  to  different  points  on  the 
basic  structure,  but  as  far  as  both  extend,  they  must  follow  the  same 
pattern. 

Removal  of  initial  elements  from  this  basic  structure  may  not  produce 
any  essential  difference.  Thus  Nn  and  Nn  —  {0}  are  well-ordered  sets  with 
the  same  order  type,  although  Nn  —  { 0 }  is  got  by  removing  the  first  ele- 
ment of  Nn.  However,  if  one  starts  at  the  beginning  of  this  basic  structure 
and  proceeds  to  different  points,  one  gets  different  order  types.  This  is  the 
essential  import  of  the  next  two  theorems. 

**Theorem XII.2.1. $\vdash (P,Q,y): P \in \mathrm{Word}.\ Q \subseteq \mathrm{seg}_y P.\ y \in \mathrm{AV}(P).\ \supset.\ {\sim}(P\ \mathrm{smor}\ Q)$.

Proof. Proof by reductio ad absurdum. Assume

(1) $P \in \mathrm{Word}$,

(2) $Q \subseteq \mathrm{seg}_y P$,

(3) $y \in \mathrm{AV}(P)$,

(4) $P\ \mathrm{smor}_R\ Q$.

Now define $f$ by weak induction so that

(5) $f(0) = y$,

(6) $(n): n \in \mathrm{Nn}.\ \supset.\ f(n + 1) = R(f(n))$.

Lemma 1. $(n): n \in \mathrm{Nn}.\ \supset.\ f(n) \in \mathrm{AV}(P)$.

Proof. Proof by weak induction. Clearly our lemma is valid for $n = 0$. Assume $f(n) \in \mathrm{AV}(P)$. Then by (4), $R(f(n)) \in \mathrm{AV}(Q)$. Then by (6) and (2), $f(n + 1) \in \mathrm{AV}(P)$.

Lemma 2. $(n): n \in \mathrm{Nn}.\ \supset.\ (f(n + 1))(P - I)(f(n))$.

Proof. Proof by weak induction on $n$. By (3) and (4), $R(y) \in \mathrm{AV}(Q)$. Then by Thm.X.6.12, Part III, and (2), $(R(y))(P - I)y$. Thus $(f(1))(P - I)(f(0))$, and our lemma holds for $n = 0$.

Assume $(f(n + 1))(P - I)(f(n))$. Then by Lemma 1 and (4), $(R(f(n + 1)))Q(R(f(n)))$. That is, by (6), $(f((n + 1) + 1))Q(f(n + 1))$. Then $(f((n + 1) + 1))P(f(n + 1))$ by (2). Also, if $R(f(n + 1)) = R(f(n))$, then $f(n + 1) = f(n)$ by $R \in 1\text{-}1$. So $(f((n + 1) + 1))(P - I)(f(n + 1))$.

By Lemma 1, $\mathrm{Val}(f) = \mathrm{Val}(f) \cap \mathrm{AV}(P)$. Then by (5), (3), and Thm.X.6.14, Part II, there is a least $x$ in $\mathrm{Val}(f)$. That is,

(7) $x \in \mathrm{Val}(f)$,

(8) $(w): w \in \mathrm{Val}(f).\ \supset.\ xPw$.

Then by (7), there is an $n$ with $n \in \mathrm{Nn}.\ x = f(n)$. Then $f(n + 1) \in \mathrm{Val}(f)$. So by (8), $(f(n))P(f(n + 1))$. On the other hand, $(f(n + 1))(P - I)(f(n))$ by Lemma 2. Then, since $P \in \mathrm{Antisym}$ by (1), we get a contradiction by Thm.X.6.4, Part V. Thus our theorem follows.

From  this  theorem  it  follows  that,  if  one  terminates  a  well-ordered  series 
at  two  different  places,  the  resulting  series  are  ordinally  dissimilar.  We 
express  this  precisely  in  the  next  theorem. 

Theorem XII.2.2. $\vdash (P,x,y): P \in \mathrm{Word}.\ x,y \in \mathrm{AV}(P).\ (\mathrm{seg}_x P)\ \mathrm{smor}\ (\mathrm{seg}_y P).\ \supset.\ x = y$.

Proof. Proof by reductio ad absurdum. Assume the hypothesis and $x \neq y$. Put $Q = \mathrm{seg}_x P$ and $R = \mathrm{seg}_y P$. Then

(1) $Q\ \mathrm{smor}\ R$.

Since $x,y \in \mathrm{AV}(P)$ and $P \in \mathrm{Connex}$, $xPy.\lor.yPx$.

Case 1. $xPy$. Then $x(P - I)y$, so that $x \in \mathrm{AV}(R)$ by Thm.X.6.12, Part III. Then $\mathrm{seg}_x R = \mathrm{seg}_x P$ by Thm.X.6.13, corollary. That is, $\mathrm{seg}_x R = Q$. But $R \in \mathrm{Word}$ by Thm.X.6.14, Part V, so that we get a contradiction by Thm.XII.2.1.

Case 2. $yPx$. One derives a similar contradiction.

If two well-ordered sets are ordinally similar, they must extend equally
far along the basic structure. Moreover, there must be a unique order-preserving
correspondence between them, namely, that one which puts into
correspondence the elements which are equally far along the basic structure.

Theorem XII.2.3. $\vdash (P,Q): P,Q \in \mathrm{Word}.\ P\ \mathrm{smor}\ Q.\ \supset.\ (\mathrm{E}_1 R).\ P\ \mathrm{smor}_R\ Q$.

Proof. Assume $P,Q \in \mathrm{Word}$ and $P\ \mathrm{smor}\ Q$. Then, by the definition of smor, there must be an $R$ such that $P\ \mathrm{smor}_R\ Q$. To prove that $R$ is unique, let also $P\ \mathrm{smor}_S\ Q$ and $R \neq S$. Then by Thm.X.5.16 and duality, there is an $x$ such that $x \in \mathrm{Arg}(R)$, $x \in \mathrm{Arg}(S)$, $R(x) \neq S(x)$. Let $y = R(x)$, $z = S(x)$, $T = \breve{R}|S$. Then by Thm.XII.1.4, $Q\ \mathrm{smor}_T\ Q$. Also, by Thm.X.5.20, Part II, $T(y) = S(\breve{R}(R(x))) = S(x) = z$. Then by Thm.XII.1.13, Cor. 1, $(\mathrm{seg}_y Q)\ \mathrm{smor}\ (\mathrm{seg}_z Q)$. Then by Thm.XII.2.2, $y = z$, and we have a
contradiction.

Of  two  well-ordered  sets,  we  say  that  the  first  is  shorter  than  the  second 
if  the  first  is  ordinally  similar  to  an  initial  segment  of  the  second.  Or, 
putting  the  definition  in  terms  of  well-ordering  relations,  we  define 

$$\mathrm{sr} \quad\text{for}\quad \hat{P}\hat{Q}(P,Q \in \mathrm{Word}: (\mathrm{E}x).\ x \in \mathrm{AV}(Q).\ P\ \mathrm{smor}\ (\mathrm{seg}_x Q)).$$

Clearly  sr  is  stratified,  and  as  it  has  no  free  variables,  it  may  be  assigned 
any  type. 
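
For finite well-orderings the relation sr can be computed directly. The sketch below is illustrative only: it assumes a well-ordering is given as a finite, reflexive set of ordered pairs, and the helper names (`leq_on`, `in_order`, `segment`, `similar`, `shorter`) are ours rather than the text's. It uses the fact that the elements of a finite well-ordered set can be listed from first to last, so that the only candidate for an order-preserving correspondence pairs the elements position by position.

```python
def leq_on(seq):
    """The reflexive well-ordering whose field is seq, listed first to last."""
    seq = list(seq)
    return {(seq[i], seq[j]) for i in range(len(seq)) for j in range(i, len(seq))}


def field(rel):
    """AV(P): every element occurring in some pair of rel."""
    return {x for pair in rel for x in pair}


def in_order(rel):
    """List AV(rel) from first to last; assumes rel is a finite well-ordering."""
    av = field(rel)
    # In a reflexive linear order the k-th element has exactly k predecessors,
    # counting itself, so sorting by that count recovers the order.
    return sorted(av, key=lambda x: sum((y, x) in rel for y in av))


def segment(rel, x):
    """seg_x P: rel restricted to the elements strictly preceding x."""
    below = {y for y in field(rel) if (y, x) in rel and y != x}
    return {(a, b) for (a, b) in rel if a in below and b in below}


def similar(p, q):
    """P smor Q for finite well-orderings, via the position-by-position map."""
    xs, ys = in_order(p), in_order(q)
    if len(xs) != len(ys):
        return False
    r = dict(zip(xs, ys))
    return {(r[a], r[b]) for (a, b) in p} == q


def shorter(p, q):
    """P sr Q: P is similar to seg_x Q for some x in AV(Q)."""
    return any(similar(p, segment(q, x)) for x in field(q))


two = leq_on("ab")
three = leq_on([1, 2, 3])
```

With these definitions `shorter(two, three)` holds, `shorter(three, two)` and `shorter(three, three)` fail, and the empty relation is shorter than any one-element well-ordering.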

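Theorem XII.2.4.
I. $\vdash \exists(\mathrm{sr})$.
II. $\vdash \mathrm{sr} \in \mathrm{Rel}$.

III. $\vdash (P,Q):. P\ \mathrm{sr}\ Q: \equiv :P,Q \in \mathrm{Word}: (\mathrm{E}x).\ x \in \mathrm{AV}(Q).\ P\ \mathrm{smor}\ (\mathrm{seg}_x Q)$.

IV. $\vdash (P,x): P \in \mathrm{Word}.\ x \in \mathrm{AV}(P).\ \supset.\ (\mathrm{seg}_x P)\ \mathrm{sr}\ P$.
V. $\vdash (P,Q,R): P\ \mathrm{smor}\ Q.\ Q\ \mathrm{sr}\ R.\ \supset.\ P\ \mathrm{sr}\ R$.

Theorem XII.2.5. $\vdash (P).\ {\sim}(P\ \mathrm{sr}\ P)$.

Proof. Take $Q$ to be $\mathrm{seg}_y P$ in Thm.XII.2.1.

Theorem XII.2.6. $\vdash (P,Q,R): P\ \mathrm{sr}\ Q.\ Q\ \mathrm{smor}\ R.\ \supset.\ P\ \mathrm{sr}\ R$.

Proof. Let $P\ \mathrm{sr}\ Q$ and $Q\ \mathrm{smor}\ R$. Then $P,Q \in \mathrm{Word}$, $x \in \mathrm{AV}(Q).\ P\ \mathrm{smor}\ (\mathrm{seg}_x Q)$, and $Q\ \mathrm{smor}_S\ R$. Then $R \in \mathrm{Word}$ by Thm.XII.1.14. Also, if we put $y = S(x)$, then $y \in \mathrm{AV}(R)$, and by Thm.XII.1.13, Cor. 1, $(\mathrm{seg}_x Q)\ \mathrm{smor}\ (\mathrm{seg}_y R)$. Then $P\ \mathrm{smor}\ (\mathrm{seg}_y R)$.

Theorem XII.2.7. $\vdash (P,Q,R): P\ \mathrm{sr}\ Q.\ Q\ \mathrm{sr}\ R.\ \supset.\ P\ \mathrm{sr}\ R$.

Proof. Let $P\ \mathrm{sr}\ Q.\ Q\ \mathrm{sr}\ R$. Then $y \in \mathrm{AV}(R).\ Q\ \mathrm{smor}\ (\mathrm{seg}_y R)$. Then by Thm.XII.2.6, $P\ \mathrm{sr}\ (\mathrm{seg}_y R)$. Thus $x \in \mathrm{AV}(\mathrm{seg}_y R).\ P\ \mathrm{smor}\ (\mathrm{seg}_x(\mathrm{seg}_y R))$. Then by Thm.X.6.13, corollary, $P\ \mathrm{smor}\ (\mathrm{seg}_x R)$. Also $x \in \mathrm{AV}(R)$ by Thm.X.6.12, Part III.

Corollary 1. $\vdash (P,Q).\ {\sim}(P\ \mathrm{sr}\ Q.\ Q\ \mathrm{sr}\ P)$.

Proof. Use Thm.XII.2.5.

Corollary 2. $\vdash (P,Q): P\ \mathrm{sr}\ Q.\ \supset.\ {\sim}(Q\ \mathrm{sr}\ P)$.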

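Theorem XII.2.8. $\vdash (P):. P \in \mathrm{Word}: \supset :(x,y): x(P - I)y.\ \equiv.\ x,y \in \mathrm{AV}(P).\ (\mathrm{seg}_x P)\ \mathrm{sr}\ (\mathrm{seg}_y P)$.

Proof. Let $P \in \mathrm{Word}$.
First, let $x(P - I)y$. Then $x,y \in \mathrm{AV}(P)$. Also, $\mathrm{seg}_x P \in \mathrm{Word}$ and $\mathrm{seg}_y P \in \mathrm{Word}$. Likewise $x \in \mathrm{AV}(\mathrm{seg}_y P)$. Also, $\mathrm{seg}_x P = \mathrm{seg}_x(\mathrm{seg}_y P)$. Then $(\mathrm{seg}_x P)\ \mathrm{sr}\ (\mathrm{seg}_y P)$.

Conversely, let $x,y \in \mathrm{AV}(P)$ and $(\mathrm{seg}_x P)\ \mathrm{sr}\ (\mathrm{seg}_y P)$. Then $z \in \mathrm{AV}(\mathrm{seg}_y P)$ and $(\mathrm{seg}_x P)\ \mathrm{smor}\ (\mathrm{seg}_z(\mathrm{seg}_y P))$. Then $z(P - I)y$ and $\mathrm{seg}_z(\mathrm{seg}_y P) = \mathrm{seg}_z P$. Thus $x,z \in \mathrm{AV}(P)$ and $(\mathrm{seg}_x P)\ \mathrm{smor}\ (\mathrm{seg}_z P)$. Then $x = z$ by Thm.XII.2.2. Thus $x(P - I)y$.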

We  wish  to  show  that,  if  one  has  given  two  well-ordered  sets,  then  either 
they  are  ordinally  similar,  or  one  is  shorter.  We  shall  base  the  proof  on  the 
possibility  of  definition  by  transfinite  induction.  It  is  not  necessary  to  use 
the  possibility  of  definition  by  transfinite  induction  (see  Rosser,  1942),  but 
we  wish  to  prove  this  possibility  anyhow,  and  we  now  do  so. 

Definition by transfinite induction is a generalization of definition by
strong induction. In both, we define a function $f$ with $\mathrm{Arg}(f) = \alpha$. In
strong induction, $\alpha$ is $\hat{n}(n \in \mathrm{Nn}.\ m \leq n)$, but in transfinite induction, $\alpha$ is
any well-ordered set; say $\alpha = \mathrm{AV}(P)$, where $P$ is a well-ordering relation.
In both cases, we define $f(x)$ in terms of the values of $f$ for elements of $\alpha$
which precede $x$. Speaking loosely, we define the value $f(x)$ in terms of the
earlier values of $f$. Since $y$ precedes $x$ if and only if $y(P - I)x$, the values
of $f$ earlier than $f(x)$ are just the values of $(\hat{y}(y(P - I)x)) \upharpoonleft f$. So we define
$f(x)$ in terms of $(\hat{y}(y(P - I)x)) \upharpoonleft f$. More generally, we let the value of $f(x)$
depend on both $x$ and $(\hat{y}(y(P - I)x)) \upharpoonleft f$. That is, given a term $j(x,f)$, we
require a function $f$ such that $\mathrm{Arg}(f) = \alpha$ and

$$f(x) = j(x,(\hat{y}(y(P - I)x)) \upharpoonleft f).$$
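
In the finite case this scheme is ordinary recursion along a list. The following sketch is only an illustration under simplifying assumptions: the well-ordered set is a Python list given from first to last, the function f is a dictionary, and `j` is any Python function of x and of f restricted to the predecessors of x. The names `define_by_recursion`, `satisfies`, and `one_plus_sum` are hypothetical.

```python
def define_by_recursion(order, j):
    """Build f with f(x) = j(x, f restricted to the elements preceding x)."""
    f = {}
    for x in order:
        earlier = dict(f)  # the values of f on the predecessors of x
        f[x] = j(x, earlier)
    return f


def satisfies(order, j, f):
    """Check the defining equation of f at every element of the list."""
    return all(
        f[x] == j(x, {y: f[y] for y in order[:i]})
        for i, x in enumerate(order)
    )


def one_plus_sum(x, earlier):
    """A sample j: one more than the sum of all earlier values."""
    return 1 + sum(earlier.values())


letters = ["a", "b", "c", "d"]
doubling = define_by_recursion(letters, one_plus_sum)
```

Here `doubling` assigns 1, 2, 4, 8 to the four letters, and it is the only dictionary on these letters for which `satisfies` holds, which is the finite counterpart of the uniqueness and existence results that follow.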

We  shall  now  show  that,  under  proper  stratification  conditions,  there  is  a 
unique  /  satisfying  these  conditions.    We  use  the  following  hypothesis. 

Hypothesis $H_6$. Let $p(x,P)$ denote $\hat{y}(y(P - I)x)$. Let $P$, $R$, $S$, $x$, $f$ be
distinct variables. Let $j(x,f)$ be a term not containing any occurrences of
$R$ or $S$. If $A$ and $B$ are terms, let $j(A,B)$ denote {Sub in $j(x,f)$: $A$ for $x$,
$B$ for $f$}, where it shall be understood that the substitutions indicated cause
no confusion.

Theorem XII.2.9. Assume Hypothesis $H_6$. Then

(1) $P \in \mathrm{Word}$,

(2) $R,S \in \mathrm{Funct}$,

(3) $\mathrm{Arg}(R) = \mathrm{Arg}(S) = \mathrm{AV}(P)$,

(4) $(x): x \in \mathrm{AV}(P).\ \supset.\ R(x) = j(x,p(x,P) \upharpoonleft R)$,

(5) $(x): x \in \mathrm{AV}(P).\ \supset.\ S(x) = j(x,p(x,P) \upharpoonleft S)$

yield

$$R = S.$$
Proof. Proof by reductio ad absurdum. Assume the hypotheses and
also $R \neq S$. By Thm.X.5.16, there is an $x$ with $x \in \mathrm{AV}(P).\ R(x) \neq S(x)$.
So by Thm.X.6.14,

(6) $y \in \mathrm{AV}(P).\ R(y) \neq S(y)$,

and $(z): z \in \mathrm{AV}(P).\ R(z) \neq S(z).\ \supset.\ yPz$. So by Thm.X.6.4, Part V,

(7) $(z): z \in \mathrm{AV}(P).\ z(P - I)y.\ \supset.\ R(z) = S(z)$.

Let $u(p(y,P) \upharpoonleft R)v$. Then $u \in p(y,P).\ uRv$. So $u(P - I)y.\ v = R(u)$. So
by (7), $u(P - I)y.\ v = S(u)$, and by (3), $u \in \mathrm{Arg}(S)$. Then $u \in p(y,P).\ uSv$,
and finally $u(p(y,P) \upharpoonleft S)v$. Conversely, we can go from $u(p(y,P) \upharpoonleft S)v$ to
$u(p(y,P) \upharpoonleft R)v$. Thus

$$p(y,P) \upharpoonleft R = p(y,P) \upharpoonleft S.$$

Then by (6), (4), and (5), $R(y) = S(y)$, and we have a contradiction by
(6).

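Theorem XII.2.10. Assume Hypothesis $H_6$. Assume further that
$R(x) = j(x,p(x,P) \upharpoonleft R)$ is stratified. Then $\vdash (P):: P \in \mathrm{Word}.: \supset :.(y):. y \in \mathrm{AV}(P): \supset :(\mathrm{E}R): R \in \mathrm{Funct}: \mathrm{Arg}(R) = \hat{z}(zPy): (x): xPy.\ \supset.\ R(x) = j(x,p(x,P) \upharpoonleft R)$.

Proof. Proof by reductio ad absurdum. Assume that

(1) $P \in \mathrm{Word}$

and that the conclusion is false. Then by Thm.X.6.14, Part II, there is a
$y$ such that

(2) $y \in \mathrm{AV}(P)$,

(3) ${\sim}(\mathrm{E}R): R \in \mathrm{Funct}: \mathrm{Arg}(R) = \hat{z}(zPy): (x): xPy.\ \supset.\ R(x) = j(x,p(x,P) \upharpoonleft R)$,

and if corresponding results hold for some $w$, then $yPw$. Then by Thm.X.6.4, Part V,

(4) $(w):. w(P - I)y: \supset :(\mathrm{E}R): R \in \mathrm{Funct}: \mathrm{Arg}(R) = \hat{z}(zPw): (x): xPw.\ \supset.\ R(x) = j(x,p(x,P) \upharpoonleft R)$.
Put

(5) $W = \hat{R}(\mathrm{E}w):. w(P - I)y: R \in \mathrm{Funct}: \mathrm{Arg}(R) = \hat{z}(zPw): (x): xPw.\ \supset.\ R(x) = j(x,p(x,P) \upharpoonleft R)$.

(6) $S = \bigcup W$.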

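Lemma 1. $(R_1,R_2,\beta): R_1,R_2 \in W.\ \beta \subseteq \mathrm{Arg}(R_1) \cap \mathrm{Arg}(R_2).\ \supset.\ \beta \upharpoonleft R_1 = \beta \upharpoonleft R_2$.
Proof. Let $R_1,R_2 \in W$. Then for $i = 1$ and $i = 2$, $w_i(P - I)y: R_i \in \mathrm{Funct}: \mathrm{Arg}(R_i) = \hat{z}(zPw_i): (x): xPw_i.\ \supset.\ R_i(x) = j(x,p(x,P) \upharpoonleft R_i)$. By $P \in \mathrm{Connex}$, we may say without loss of generality that $w_1Pw_2$. Then $\mathrm{Arg}(R_1) \subseteq \mathrm{Arg}(R_2)$. Putting $\mathrm{Arg}(R_1) \upharpoonleft P \upharpoonright \mathrm{Arg}(R_1)$, $R_1$, and $\mathrm{Arg}(R_1) \upharpoonleft R_2$ for $P$, $R$, and $S$ in Thm.XII.2.9 gives $R_1 = \mathrm{Arg}(R_1) \upharpoonleft R_2$. Then our lemma follows by Thm.X.4.11, Part V.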

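Lemma 2. $(R_1,R_2,u,v_1,v_2): R_1,R_2 \in W.\ uR_1v_1.\ uR_2v_2.\ \supset.\ v_1 = v_2$.

Proof. Put $\beta = \mathrm{Arg}(R_1) \cap \mathrm{Arg}(R_2)$. Then from the hypothesis of the lemma, $u(\beta \upharpoonleft R_1)v_1.\ u(\beta \upharpoonleft R_2)v_2$. Thus, by Lemma 1, $u(\beta \upharpoonleft R_1)v_2$. But by (5), $\beta \upharpoonleft R_1 \in \mathrm{Funct}$. So $v_1 = v_2$.

Lemma 3. $(R,u,v_1,v_2): R \in W.\ uRv_1.\ uSv_2.\ \supset.\ v_1 = v_2$.

Proof. If $uSv_2$, then by (6), $R_2 \in W.\ uR_2v_2$. Then $v_1 = v_2$ by Lemma 2.

Lemma 4. $S \in \mathrm{Funct}$.

Proof. Obvious by (6) and Lemma 2.

Lemma 5. $\mathrm{Arg}(S) = p(y,P)$.

Proof. If $uSv$, then $R \in W$ and $u \in \mathrm{Arg}(R)$. So $w(P - I)y.\ uPw$. Then $u \in p(y,P)$ by Thm.X.6.5, Part VI. Conversely, let $u \in p(y,P)$. Then $u(P - I)y$, so that by (4) there is an $R$ in $W$ with $\mathrm{Arg}(R) = \hat{z}(zPu)$. So $u \in \mathrm{Arg}(R)$, and hence $u \in \mathrm{Arg}(S)$ by (6).

Lemma 6. $(R,\beta): R \in W.\ \beta \subseteq \mathrm{Arg}(R).\ \supset.\ \beta \upharpoonleft R = \beta \upharpoonleft S$.

Proof. Let $R \in W.\ \beta \subseteq \mathrm{Arg}(R)$. If $u(\beta \upharpoonleft R)v$, then $u \in \beta.\ uRv$. Then by (6), $u \in \beta.\ uSv$. So $u(\beta \upharpoonleft S)v$. If $u(\beta \upharpoonleft S)v$, then $u \in \beta.\ uSv$. But $\beta \subseteq \mathrm{Arg}(R)$. So $uRw$. Then $v = w$ by Lemma 3. So $uRv$, and finally $u(\beta \upharpoonleft R)v$.

Lemma 7. $(x): x(P - I)y.\ \supset.\ S(x) = j(x,p(x,P) \upharpoonleft S)$.

Proof. Use (5), (6), Lemma 3, Lemma 5, and Lemma 6.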

Now put

(7) $R = S \cup \{\langle y,j(y,S)\rangle\}$.
Then

(8) $R \in \mathrm{Funct}$,

(9) $\mathrm{Arg}(R) = \hat{z}(zPy)$,

(10) $S = p(y,P) \upharpoonleft R$,

(11) $(x): xPy.\ \supset.\ R(x) = j(x,p(x,P) \upharpoonleft R)$.

Then by (3), we have a contradiction.
**Theorem XII.2.11. Assume Hypothesis $H_6$. Assume further that
$R(x) = j(x,p(x,P) \upharpoonleft R)$ is stratified. Then $\vdash (P):: P \in \mathrm{Word}.: \supset :.(\mathrm{E}R): R \in \mathrm{Funct}: \mathrm{Arg}(R) = \mathrm{AV}(P): (x): x \in \mathrm{AV}(P).\ \supset.\ R(x) = j(x,p(x,P) \upharpoonleft R)$.

Proof. Assume

(1) $P \in \mathrm{Word}$.
Define

(2) $W = \hat{R}(\mathrm{E}y):. y \in \mathrm{AV}(P): R \in \mathrm{Funct}: \mathrm{Arg}(R) = \hat{z}(zPy): (x): xPy.\ \supset.\ R(x) = j(x,p(x,P) \upharpoonleft R)$.

(3) $S = \bigcup W$.

As in the proof of Thm.XII.2.10, we prove seven lemmas. Lemma 5 will
be $\mathrm{Arg}(S) = \mathrm{AV}(P)$. Then our theorem will follow from Lemmas 4, 5, and
7 by taking $R$ to be $S$.

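Theorem XII.2.12. Assume Hypothesis $H_5$. Assume further that
$R(n + 1) = j(n,R)$ is stratified. Then

(1) $m \in \mathrm{Nn}$,

(2) $\alpha = \hat{n}(n \in \mathrm{Nn}.\ m \leq n)$,

(3) $(R,S,n):: R,S \in \mathrm{Funct}.\ \mathrm{Arg}(R) \subseteq \alpha.\ \mathrm{Arg}(S) \subseteq \alpha.\ n \in \alpha:. (x): x \in \alpha.\ x \leq n.\ \supset.\ R(x) = S(x).: \supset :.j(n,R) = j(n,S)$
yield

$$(\mathrm{E}R):: R \in \mathrm{Funct}.\ \mathrm{Arg}(R) = \alpha.\ R(m) = a:. (n): n \in \alpha.\ \supset.\ R(n + 1) = j(n,R).$$
Proof. Assume the hypotheses. In Thm.XII.2.11, put

$$P = (\alpha \upharpoonleft \leq_c \upharpoonright \alpha)$$
and take $j(x,f)$ to be

$$\iota y\,(x = m.\ y = a.\lor.(\mathrm{E}n): n \in \alpha.\ x = n + 1.\ y = j(n,f)).$$
Then we infer that there is an $R$ such that

$R \in \mathrm{Funct}$,

$\mathrm{Arg}(R) = \alpha$,

$R(m) = a$,

(4) $(n): n \in \alpha.\ \supset.\ R(n + 1) = j(n,(\hat{x}(x \in \alpha.\ x \leq n)) \upharpoonleft R)$.

Now let $n \in \alpha$ and

$$S = (\hat{x}(x \in \alpha.\ x \leq n)) \upharpoonleft R.$$

Then clearly $R,S \in \mathrm{Funct}.\ \mathrm{Arg}(R) \subseteq \alpha.\ \mathrm{Arg}(S) \subseteq \alpha.\ n \in \alpha:. (x): x \in \alpha.\ x \leq n.\ \supset.\ R(x) = S(x)$. So by (3), $j(n,R) = j(n,S)$. Then by (4), $R(n + 1) = j(n,R)$. Thus our theorem is proved.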

By analogous proofs, one can prove analogues of Thms.XII.2.9 to
XII.2.12 satisfying other stratification conditions. Thus if $R(\{x\}) = j(x,\mathrm{USC}(p(x,P)) \upharpoonleft R)$ is stratified and $P \in \mathrm{Word}$, then one can infer that there
is a unique $R$ such that

$R \in \mathrm{Funct}$,

$\mathrm{Arg}(R) = \mathrm{USC}(\mathrm{AV}(P))$,
$(x): x \in \mathrm{AV}(P).\ \supset.\ R(\{x\}) = j(x,\mathrm{USC}(p(x,P)) \upharpoonleft R)$.

Just  in  passing,  we  note  that  there  is  a  principle  of  proof  by  transfinite 
induction.  It  is  a  generalization  of  Thm.XI.3.20.  It  is  seldom  used,  since 
in  most  instances  it  is  nearly  as  easy  to  carry  out  the  proof  of  the  principle 
for  the  particular  case  at  hand  as  to  apply  the  principle.  The  principle  is 
embodied  in  the  following  theorem. 

Theorem XII.2.13. Assume that $F(x)$ is stratified. Then

(1) $P \in \mathrm{Word}$,

(2) $(x):: x \in \mathrm{AV}(P):. (y): y(P - I)x.\ \supset.\ F(y).: \supset :.F(x)$

yield

$$(x): x \in \mathrm{AV}(P).\ \supset.\ F(x).$$

Proof. Proof by reductio ad absurdum. Assume the hypotheses and
${\sim}(x): x \in \mathrm{AV}(P).\ \supset.\ F(x)$. Then by Thm.X.6.14, Part II, there is an $x$
such that

(3) $x \in \mathrm{AV}(P)$,

(4) ${\sim}F(x)$,

and $(y): y \in \mathrm{AV}(P).\ {\sim}F(y).\ \supset.\ xPy$. Then by Thm.X.6.4, Part V,
$(y): y(P - I)x.\ \supset.\ F(y)$. Then by (3) and (2), $F(x)$. Thus we have a
contradiction by (4).

We  return  to  our  unfinished  business,  which  consists  of  proving  the  fol- 
lowing theorem. 
**Theorem XII.2.14. $\vdash (P,Q):. P,Q \in \mathrm{Word}: \supset :P\ \mathrm{sr}\ Q.\lor.P\ \mathrm{smor}\ Q.\lor.Q\ \mathrm{sr}\ P$.

Proof. Let

(1) $P,Q \in \mathrm{Word}$.

Let $p(x,P)$ denote $\hat{y}(y(P - I)x)$. By Thm.XII.2.11, there is an $R$ such
that

(2) $R \in \mathrm{Funct}$,

(3) $\mathrm{Arg}(R) = \mathrm{AV}(P)$,

(4) $(x): x \in \mathrm{AV}(P).\ \supset.\ R(x) = \iota y\,(y\ \mathrm{least}_Q\ (\mathrm{AV}(Q) - \mathrm{Val}(p(x,P) \upharpoonleft R)))$.
Write
(5) $B(x) = \mathrm{Val}(p(x,P) \upharpoonleft R)$,

(6) $C(x) = \mathrm{AV}(Q) - B(x)$.
Then we can rewrite (4) as

(4) $(x): x \in \mathrm{AV}(P).\ \supset.\ R(x) = \iota y\,(y\ \mathrm{least}_Q\ C(x))$.

Lemma 1. $\mathrm{AV}(Q) - \mathrm{Val}(R) \neq \Lambda.\ \supset.\ P\ \mathrm{sr}\ Q$.

Proof. Assume $\mathrm{AV}(Q) - \mathrm{Val}(R) \neq \Lambda$. Since $B(x) \subseteq \mathrm{Val}(R)$ by (5),
it follows by (6) that $(x): x \in \mathrm{AV}(P).\ \supset.\ C(x) \neq \Lambda$. So by (4) and Thm.X.6.14, Part II,

(i) $(x): x \in \mathrm{AV}(P).\ \supset.\ R(x)\ \mathrm{least}_Q\ C(x)$.

Also there is a $w$ such that
(ii) $w\ \mathrm{least}_Q\ (\mathrm{AV}(Q) - \mathrm{Val}(R))$.

Then by Thm.X.6.4, Part V,
(iii) $(x): x(Q - I)w.\ \supset.\ x \in \mathrm{Val}(R)$.

Now let $x \in \mathrm{Val}(R)$. Then by (2) and (3), $x = R(y).\ y \in \mathrm{AV}(P)$. So by
(i), $x \in \mathrm{AV}(Q)$. Then by (ii), $x \neq w$. Also, since $w \in (\mathrm{AV}(Q) - \mathrm{Val}(R))$
by (ii), and since $(\mathrm{AV}(Q) - \mathrm{Val}(R)) \subseteq C(y)$ by (5) and (6), we get $w \in C(y)$.
Then by (i), $(R(y))Qw$. Thus $x(Q - I)w$. So we have shown $x \in \mathrm{Val}(R).\ \supset.\ x(Q - I)w$. Then by (iii), $(x): x(Q - I)w.\ \equiv.\ x \in \mathrm{Val}(R)$. Thus by
Thm.X.6.12, Part III,

(iv) $\mathrm{AV}(\mathrm{seg}_w Q) = \mathrm{Val}(R)$.

Let $x,y \in \mathrm{AV}(P).\ R(x) = R(y).\ x \neq y$. Without loss of generality, we can
take $yPx$. Then $y(P - I)x$. So by (5), $R(y) \in B(x)$, and by (6), ${\sim}\,R(y) \in C(x)$. But by (i), $R(x) \in C(x)$. So we have a contradiction. Thus

(v) $(x,y): x,y \in \mathrm{AV}(P).\ R(x) = R(y).\ \supset.\ x = y$.

If now $xRz.\ yRz$, then by (3), $x,y \in \mathrm{AV}(P)$. Also $R(x) = z = R(y)$. So by
(v), $x = y$. Thus by (2),

(vi) $R \in 1\text{-}1$.

Now let $xPy$. Then by (3), (2), and (i), $R(y) \in \mathrm{AV}(Q)$. Also ${\sim}(y(P - I)x)$
by Thm.X.6.4, Part V. So ${\sim}\,R(y) \in B(x)$ by (5) and (v). Thus $R(y) \in C(x)$ by (6). Then by (i), $(R(x))Q(R(y))$. Also $R(x),R(y) \in \mathrm{Val}(R)$. Then
by (iv) and Thm.X.6.12, Part III, $R(x),R(y) \in p(w,Q)$. Then $(R(x))(\mathrm{seg}_w Q)(R(y))$. Thus

(vii) $(x,y): xPy.\ \supset.\ (R(x))(\mathrm{seg}_w Q)(R(y))$.
Let $x(\mathrm{seg}_w Q)y$. Then by (iv), $x,y \in \mathrm{Val}(R)$. So by (vi) and (3), $\breve{R}(x),\breve{R}(y) \in \mathrm{AV}(P)$. So $(\breve{R}(x))P(\breve{R}(y)).\lor.(\breve{R}(y))P(\breve{R}(x))$. If $(\breve{R}(y))P(\breve{R}(x))$, then
$(R(\breve{R}(y)))Q(R(\breve{R}(x)))$ by (vii). Thus $yQx$. Then $x = y$ since $Q \in \mathrm{Antisym}$.
Then $(\breve{R}(x))P(\breve{R}(y))$ since $P \in \mathrm{Ref}$. Thus, in both cases, $(\breve{R}(x))P(\breve{R}(y))$. So

(viii) $(x,y): x(\mathrm{seg}_w Q)y.\ \supset.\ (\breve{R}(x))P(\breve{R}(y))$.

Now by (1), (vi), (3), (iv), (vii), (viii), and Thm.XII.1.1, $P\ \mathrm{smor}\ (\mathrm{seg}_w Q)$.
Also by (ii), $w \in \mathrm{AV}(Q)$. So $P\ \mathrm{sr}\ Q$.

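Lemma 2. $(\mathrm{E}x): x \in \mathrm{AV}(P).\ C(x) = \Lambda:. \supset :.Q\ \mathrm{sr}\ P$.

Proof. Assume the hypothesis and let $x$ be the least element of $\mathrm{AV}(P)$
such that $C(x) = \Lambda$. Then

(i) $x \in \mathrm{AV}(P)$,

(ii) $C(x) = \Lambda$,

(iii) $(y): y \in \mathrm{AV}(P).\ y(P - I)x.\ \supset.\ C(y) \neq \Lambda$.

Put

(iv) $\alpha = p(x,P)$,

(v) $S = \alpha \upharpoonleft R$.

By (4) and (iii),

(vi) $(y): y \in \alpha.\ \supset.\ R(y)\ \mathrm{least}_Q\ C(y)$.

By (3)

(vii) $\mathrm{Arg}(S) = \alpha = \mathrm{AV}(\mathrm{seg}_x P)$.

By (5), (iv), and (v), $B(x) = \mathrm{Val}(S)$. Then by (6) and (ii), $\mathrm{AV}(Q) \subseteq \mathrm{Val}(S)$. Also, by (vi), $\mathrm{Val}(S) \subseteq \mathrm{AV}(Q)$. So

(viii) $\mathrm{Val}(S) = \mathrm{AV}(Q)$.

By (2), $S \in \mathrm{Funct}$. Thus we can reason from (vi) as we reasoned from
(i) in the proof of Lemma 1 to infer

(ix) $(y,z): y,z \in \mathrm{Arg}(\mathrm{seg}_x P).\ S(y) = S(z).\ \supset.\ y = z$,

(x) $S \in 1\text{-}1$,

(xi) $(y,z): y(\mathrm{seg}_x P)z.\ \supset.\ (S(y))Q(S(z))$,

(xii) $(y,z): yQz.\ \supset.\ (\breve{S}(y))(\mathrm{seg}_x P)(\breve{S}(z))$.

Then our lemma follows.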

Lemma 3. $\mathrm{AV}(Q) - \mathrm{Val}(R) = \Lambda:. (x): x \in \mathrm{AV}(P).\ \supset.\ C(x) \neq \Lambda.:. \supset ::P\ \mathrm{smor}\ Q$.
Proof. Assume
(i) $\mathrm{AV}(Q) - \mathrm{Val}(R) = \Lambda$,

(ii) $(x): x \in \mathrm{AV}(P).\ \supset.\ C(x) \neq \Lambda$.

Then by (4),

(iii) $(x): x \in \mathrm{AV}(P).\ \supset.\ R(x)\ \mathrm{least}_Q\ C(x)$.

By (i), $\mathrm{AV}(Q) \subseteq \mathrm{Val}(R)$, and by (iii) and (3), $\mathrm{Val}(R) \subseteq \mathrm{AV}(Q)$. So

(iv) $\mathrm{Val}(R) = \mathrm{AV}(Q)$.

We can now reason from (iii) as we reasoned from (i) in the proof of
Lemma 1 to infer

(v) $(x,y): x,y \in \mathrm{AV}(P).\ R(x) = R(y).\ \supset.\ x = y$,

(vi) $R \in 1\text{-}1$,

(vii) $(x,y): xPy.\ \supset.\ (R(x))Q(R(y))$,

(viii) $(x,y): xQy.\ \supset.\ (\breve{R}(x))P(\breve{R}(y))$.

Thus our lemma follows.

Now our theorem follows since at least one of the hypotheses of Lemmas
1, 2, or 3 must hold.
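
The function R constructed in this proof can be followed step by step when both well-orderings are finite. The sketch below is an illustration only, assuming each well-ordering is given as a Python sequence listed from first to last; the function name `compare` and the returned labels are ours. At each x it takes R(x) to be the Q-least element not yet used as a value, and the three possible outcomes correspond to the three lemmas.

```python
def compare(p_order, q_order):
    """Follow the construction of R; return the verdict and the map built."""
    r = {}
    for x in p_order:
        used = set(r.values())  # B(x): values of R on the predecessors of x
        remaining = [y for y in q_order if y not in used]  # C(x), in Q-order
        if not remaining:
            # Lemma 2: C(x) is empty for some x, so Q is shorter than P.
            return "Q sr P", r
        r[x] = remaining[0]  # R(x) is the Q-least element of C(x)
    if len(r) < len(q_order):
        # Lemma 1: some element of AV(Q) is not a value of R.
        return "P sr Q", r
    # Lemma 3: R maps AV(P) onto AV(Q) and C(x) was never empty.
    return "P smor Q", r


verdict_short, map_short = compare("ab", [1, 2, 3])
verdict_long, map_long = compare([1, 2, 3], "ab")
verdict_same, map_same = compare("abc", [1, 2, 3])
```

In the first call the map built is a correspondence between the two-element ordering and the segment of the three-element ordering determined by 3, which is what Lemma 1 asserts in general.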

EXERCISES 

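XII.2.1. Prove $\vdash (P,Q): P\ \mathrm{sr}\ Q.\ \supset.\ \mathrm{Nc}(\mathrm{AV}(P)) \leq \mathrm{Nc}(\mathrm{AV}(Q))$.

XII.2.2. Prove $\vdash (P,Q):: P,Q \in \mathrm{Word}.: \supset :.P\ \mathrm{sr}\ Q.\lor.P\ \mathrm{smor}\ Q: \equiv :(x): x \in \mathrm{AV}(P).\ \supset.\ (\mathrm{E}y).\ y \in \mathrm{AV}(Q).\ (\mathrm{seg}_x P)\ \mathrm{smor}\ (\mathrm{seg}_y Q)$. (Hint. To go
from left to right, use Thm.XII.1.13, Cor. 1, and Thm.X.6.13, corollary.
To go from right to left, assume the right side and $Q\ \mathrm{sr}\ P$ and get a contradiction
by Thm.XII.2.1. Then use Thm.XII.2.14.)

XII.2.3. Prove $\vdash (P,Q):: P\ \mathrm{sr}\ Q.\lor.P\ \mathrm{smor}\ Q: Q\ \mathrm{sr}\ P.\lor.Q\ \mathrm{smor}\ P.: \supset :.P\ \mathrm{smor}\ Q$.

XII.2.4. Prove $\vdash (P):: P \in \mathrm{Word}.: \supset :.(x,y):. xPy: \equiv :x,y \in \mathrm{AV}(P): (\mathrm{seg}_x P)\ \mathrm{sr}\ (\mathrm{seg}_y P).\lor.(\mathrm{seg}_x P)\ \mathrm{smor}\ (\mathrm{seg}_y P)$.

XII.2.5. Prove:

(a) $\vdash \mathrm{Word} \upharpoonleft \mathrm{smor} = \mathrm{smor} \upharpoonright \mathrm{Word} = \mathrm{Word} \upharpoonleft \mathrm{smor} \upharpoonright \mathrm{Word}$.

(b) $\vdash \mathrm{Word} \upharpoonleft \mathrm{smor} \in \mathrm{Equiv}$.

(c) $\vdash \mathrm{Arg}(\mathrm{Word} \upharpoonleft \mathrm{smor}) = \mathrm{Val}(\mathrm{Word} \upharpoonleft \mathrm{smor}) = \mathrm{AV}(\mathrm{Word} \upharpoonleft \mathrm{smor}) = \mathrm{Word}$.

XII.2.6. Prove $\vdash (x).\ \Lambda\ \mathrm{sr}\ \{\langle x,x\rangle\}$.
XII.2.7. Prove $\vdash (P,Q): Q \in \mathrm{Word}.\ P \subseteq Q.\ \supset.\ {\sim}(Q\ \mathrm{sr}\ P)$. (Hint. Let
$Q\ \mathrm{smor}_R\ (\mathrm{seg}_x P)$. Then define $f$ by

$f(0) = x$,

$(n): n \in \mathrm{Nn}.\ \supset.\ f(n + 1) = R(f(n))$,

and proceed to a contradiction as in the proof of Thm.XII.2.1.)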
XII.2.8.     Using  the  definitions  of  Exs.XII.1.5  and  XII. 1.6,  prove 

(a)  h  (P,x)-P  e  Word.~  x  e  AV(P).  D  .P  sr  (P  +,  Ka^,^:)}). 

(b)  h  (P,Q,R):P  e  Word.Q  sr  P.AV(P)  n  AV(Q)  =  A.AV(P)  n  AV(i2)  = 

A.  D  .(P  +.  Q)  sr  (P  +.  R). 

(c)  h  (P,Q,R):P  €  Word.P  5^  A.Q  sr  R.  D  .(P  X.  Q)  sr  (P  X«  P). 

(d)  \-  (P,Q,R):P,Q,R  €  Word.AV(P)  n  AV(Q)  =  A.AV(P)  n  AV(P)  = 

A.(P  +,  Q)  smor  (P  +,  P).  D  .Q  smor  R. 

(e)  h  (P,Q):.P,Q  e  Word:  3  :P  sr  Q.  =  .(EP).P  e  Word.AV(P)  n  AV(P) 

=  A.P  5^  A.Q  smor  P  +.  P. 

3.  Elementary  Properties  of  Ordinal  Numbers.  Since  P  smor  Q  ex- 
presses the  fact  that  P  and  Q  have  the  same  order  structure,  we  can  define 
the  order  of  P  or  Q  by  abstraction  with  respect  to  the  equivalence  relation 
smor.  The  technical  name  for  the  order  of  P,  or  the  order  determined  by  P, 
is  the  "order  type"  of  P.  Thus  we  can  have  the  order  type  of  the  contin- 
uum, which  is  the  equivalence  class  with  respect  to  smor  of  the  relation  < 
for  real  numbers.  Similarly  there  is  the  order  type  of  the  rationals,  the 
order  type  of  the  integers,  the  order  type  of  the  positive  integers  (namely, 
the  order  type  of  sequences),  etc. 

In the present section we shall restrict attention to the order types of
well-ordered sets. Such order types are called ordinal numbers. So we
define the ordinal number $\mathrm{No}(P)$ of a well-ordering relation $P$ by abstraction
with respect to $\mathrm{Word} \upharpoonleft \mathrm{smor}$, namely,

$$\mathrm{No}(P) \quad\text{for}\quad (\mathrm{Word} \upharpoonleft \mathrm{smor})''\{P\}.$$

Clearly $\mathrm{No}(P)$ is stratified if and only if $P$ is stratified, and must have
type one higher than the type of $P$ if it is stratified.

We have taken $\mathrm{No}(P)$ to be the equivalence class of $P$ relative to
$\mathrm{Word} \upharpoonleft \mathrm{smor}$. If $P$ is a well-ordering relation, we shall have $P \in \mathrm{No}(P)$.
Otherwise $\mathrm{No}(P) = \Lambda$.
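
A small finite model may help fix the idea of an ordinal number as an equivalence class. The sketch below is an analogy only, not the construction of the text: it assumes a three-element universe, lists every well-ordering of every subset of it, and groups them by ordinal similarity. All function names are ours.

```python
from itertools import permutations


def leq_on(seq):
    """The reflexive well-ordering whose field is seq, listed first to last."""
    seq = list(seq)
    return frozenset(
        (seq[i], seq[j]) for i in range(len(seq)) for j in range(i, len(seq))
    )


def in_order(rel):
    """List the field of a finite well-ordering from first to last."""
    av = {x for pair in rel for x in pair}
    return sorted(av, key=lambda x: sum((y, x) in rel for y in av))


def similar(p, q):
    """P smor Q for finite well-orderings, via the position-by-position map."""
    xs, ys = in_order(p), in_order(q)
    if len(xs) != len(ys):
        return False
    r = dict(zip(xs, ys))
    return {(r[a], r[b]) for (a, b) in p} == q


def ordinal_classes(universe):
    """Group all well-orderings of subsets of universe into smor-classes."""
    words = {
        leq_on(perm)
        for k in range(len(universe) + 1)
        for perm in permutations(universe, k)
    }
    classes = []
    for p in words:
        for cls in classes:
            if similar(p, cls[0]):
                cls.append(p)
                break
        else:
            classes.append([p])
    return classes


classes = ordinal_classes([0, 1, 2])
```

In this model there are sixteen well-orderings and four classes, one for each possible length from 0 to 3; each class plays the role of No(P) for any P belonging to it.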

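We now define the set of ordinal numbers as the set of equivalence
classes with respect to $\mathrm{Word} \upharpoonleft \mathrm{smor}$, namely,

$$\mathrm{NO} \quad\text{for}\quad \mathrm{EqC}(\mathrm{Word} \upharpoonleft \mathrm{smor}).$$

NO is stratified, and has no free variables, and so may be assigned any
type.
Theorem XII.3.1.

I. $\vdash (P): P \in \mathrm{No}(P).\ \equiv.\ P \in \mathrm{Word}$.
II. $\vdash (P): P \in \mathrm{Word}.\ \equiv.\ \mathrm{No}(P) \neq \Lambda$.

III. $\vdash (P,Q):. P,Q \in \mathrm{Word}: \supset :Q \in \mathrm{No}(P).\ \equiv.\ P\ \mathrm{smor}\ Q$.

IV. $\vdash (P,Q): Q \in \mathrm{No}(P).\ \equiv.\ P\ \mathrm{smor}\ Q.\ P,Q \in \mathrm{Word}$.
Corollary 1. $\vdash (P,Q): P \in \mathrm{No}(Q).\ \equiv.\ Q \in \mathrm{No}(P)$.
Corollary 2. $\vdash (P,Q,R): P,Q \in \mathrm{No}(R).\ \supset.\ P\ \mathrm{smor}\ Q$.
Corollary 3. $\vdash (P,Q,R): P \in \mathrm{No}(R).\ P\ \mathrm{smor}\ Q.\ \supset.\ Q \in \mathrm{No}(R)$.
Corollary 4. $\vdash (P,Q): Q \in \mathrm{No}(P).\ \equiv.\ P\ \mathrm{smor}\ Q.\ P \in \mathrm{Word}$.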

Since  j-  Word^smor  e  Equiv,  we  can  apply  the  theorems  of  Sec.  7  of 
Chapter  X  to  get  the  following  theorem. 
Theorem  XII.3.2. 

I.  ^(4>):.<t>  e  NO:  ^  :(EP).P  e  Word.0  =  No(P). 
II.  I-  iP,Q):P  €  Word.P  smor  Q.  =  .P,Q  e  Word.No(P)  =  No(Q). 

III.  \-  ((j>):.(j>  e  NO:   D   :{P):P  e  </..  =  .</.  =  No(P). 

IV.  h  {<t>,e):cl>,d  €  N0.(^  n  ^  F^  A.  D   .</)  =  0. 
V.  \-  (P):P  e  Word.  D  .(Ei<^).</.  e  NO.P  e  0. 

VI.  \-  {<i>):4>  €  NO.  D  .0  5^  A. 
Corollary  1.     |-  {P,Q):.P,Q  e  Word:  D  :P  smor  Q.  =  .No(P)  =  No(Q). 
Corollary  2.     [-  (P):P  e  Word.  =  .No(P)  e  NO. 
Corollary  3.     \-  (P,4>):P  e  <j>.<l>  e  NO.  D  .P  e  Word. 
Theorem  XIL3.3. 

I.  |-  {P,Q,ct>):cf>  e  NO.P,Q  e(t>.  D  .P  smor  Q. 
II.  |-  {P,Q,^):cl>  e  NO.P  6  <A.P  smor  Q.  D  .Q  e  (j>. 

We now introduce the relations of greater and less between ordinal
numbers.  We define

<o         for         φ̂θ̂(EP,Q).P sr Q.φ = No(P).θ = No(Q),

>o         for         Cnv(<o),

≤o         for         <o ∪ (NO↿I),

≥o         for         Cnv(≤o).

The terms <o, >o, ≤o, ≥o are all stratified and contain no free variables
and so may be assigned any type.

In any case where it is clear from the context that we are dealing with
<o, >o, ≤o, and ≥o instead of <c, >c, ≤c, and ≥c, we shall omit the
subscript.

We commonly write φ < θ < ψ for φ < θ.θ < ψ, φ ≤ θ < ψ for φ ≤ θ.θ < ψ,
φ = θ < ψ for φ = θ.θ < ψ, etc.

Theorem XII.3.4.

I.  ⊢ (φ,θ):.φ < θ: ≡ :(EP,Q).P sr Q.φ = No(P).θ = No(Q).
II.  ⊢ (φ,θ):φ > θ. ≡ .θ < φ.
III.  ⊢ (φ,θ):.φ ≤ θ: ≡ :φ < θ.v.φ = θ.φ ε NO.
IV.  ⊢ (φ,θ):φ ≥ θ. ≡ .θ ≤ φ.
Corollary.     ⊢ (φ):φ ε NO. ⊃ .φ ≤ φ.

Theorem XII.3.5.     ⊢ (φ).~(φ < φ).

Proof.     Use Thm.XII.2.5.

Corollary.     ⊢ (φ,θ):φ < θ. ≡ .φ ≤ θ.φ ≠ θ.

Theorem XII.3.6.     ⊢ (φ,θ,ψ):φ < θ.θ < ψ. ⊃ .φ < ψ.

Proof.     Use Thm.XII.2.7.

Corollary 1.     ⊢ (φ,θ,ψ):φ ≤ θ.θ < ψ. ⊃ .φ < ψ.

Corollary 2.     ⊢ (φ,θ,ψ):φ < θ.θ ≤ ψ. ⊃ .φ < ψ.

Corollary 3.     ⊢ (φ,θ,ψ):φ ≤ θ.θ ≤ ψ. ⊃ .φ ≤ ψ.

Corollary 4.     ⊢ (φ,θ):φ < θ. ⊃ .~(θ < φ).

Proof.     Use Thm.XII.3.5.

Corollary 5.     ⊢ (φ,θ):φ < θ. ⊃ .~(θ ≤ φ).

Corollary 6.     ⊢ (φ,θ):θ ≤ φ. ⊃ .~(φ < θ).

Theorem XII.3.7.     ⊢ (φ,θ):φ ≤ θ.θ ≤ φ. ⊃ .φ = θ.

Proof.  From θ ≤ φ, we get ~(φ < θ) by Thm.XII.3.6, Cor. 6.  So
φ = θ by Thm.XII.3.4, Part III.

Theorem XII.3.8.     ⊢ (φ,θ):.φ,θ ε NO: ⊃ :φ < θ.v.φ = θ.v.φ > θ.

Proof.     Use Thm.XII.2.14.

Corollary 1.     ⊢ (φ,θ):.φ,θ ε NO: ⊃ :φ ≤ θ.v.θ ≤ φ.

Corollary 2.     ⊢ (φ,θ):.φ,θ ε NO: ⊃ :φ < θ. ≡ .~(θ ≤ φ).

Corollary 3.     ⊢ (φ,θ):.φ,θ ε NO: ⊃ :θ ≤ φ. ≡ .~(φ < θ).

Theorem XII.3.9.  ⊢ (P,φ):.φ < No(P): ⊃ :(Ey).y ε AV(P).φ =
No(seg_y P).

Proof.     Use Thm.XII.2.4, Part III.

Theorem XII.3.10.  ⊢ (P,x):P ε Word.x ε AV(P). ⊃ .No(seg_x P) <
No(P).

Proof.     Use Thm.XII.2.4, Part IV.

Theorem XII.3.11.  ⊢ (P,x,y):.P ε Word.x,y ε AV(P): ⊃ :x(P - I)y.
≡ .No(seg_x P) < No(seg_y P).

Proof.     Use Thm.XII.2.8.

Corollary.  ⊢ (P,x,y):.P ε Word.x,y ε AV(P): ⊃ :yPx. ≡ .No(seg_y P) ≤
No(seg_x P).

Proof.  By Ex.X.6.6, Part (c), x(P - I)y. ≡ .~(yPx), and by Thm.
XII.3.8, Cor. 2, No(seg_x P) < No(seg_y P). ≡ .~(No(seg_y P) ≤ No(seg_x P)).

Theorem XII.3.12.
I.  ⊢ AV(≤o) = NO.
II.  ⊢ ≤o ε Sord.

Proof of Part I.  By Thm.XII.3.4, Part I, and Thm.XII.2.4, Part III,
⊢ φ < θ. ⊃ .φ,θ ε NO.  So by Thm.XII.3.4, Part III, ⊢ AV(≤o) ⊆ NO.  By
Thm.XII.3.4, corollary, ⊢ NO ⊆ AV(≤o).

Proof of Part II.  Use Thm.XII.3.4, corollary, Thm.XII.3.6, Cor. 3,
Thm.XII.3.7, and Thm.XII.3.8, Cor. 1.
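
The segment theorems (Thm.XII.3.10 and Thm.XII.3.11) can be checked on a
small finite chain.  The following Python sketch is only an illustration under
simplifying assumptions: field, chain, seg, and finite_ordinal are hypothetical
helpers, and the length of a finite chain is used as a stand-in for its
ordinal number, which is legitimate only because finite chains of equal
length are similar.

```python
def field(rel):
    """All elements occurring in some pair of the relation (the text's AV)."""
    return {a for pair in rel for a in pair}

def chain(items):
    """Reflexive linear order listing `items` from least to greatest."""
    return {(a, b) for i, a in enumerate(items) for b in items[i:]}

def seg(rel, x):
    """Restriction of rel to the elements strictly preceding x."""
    below = {a for (a, b) in rel if b == x and a != x}
    return {(a, b) for (a, b) in rel if a in below and b in below}

def finite_ordinal(rel):
    """Stand-in for No(rel) when rel is a finite chain: its number of elements."""
    return len(field(rel))

P = chain(["a", "b", "c", "d"])

# Every proper segment is shorter than the whole chain.
segments_shorter = all(
    finite_ordinal(seg(P, x)) < finite_ordinal(P) for x in field(P)
)

# x strictly precedes y exactly when seg at x is shorter than seg at y.
order_matches = all(
    ((x, y) in P and x != y)
    == (finite_ordinal(seg(P, x)) < finite_ordinal(seg(P, y)))
    for x in field(P)
    for y in field(P)
)
```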
**Theorem XII.3.13.     ⊢ (P,φ):P ε φ.φ ε NO. ⊃ .RUSC²(P) smor (seg_φ ≤o).

Proof.     Assume

(1)  P ε φ,

(2)  φ ε NO.

Define

(3)  R = x̂θ̂(Ey).y ε AV(P).x = {{y}}.θ = No(seg_y P).

By Thm.IX.6.10, Cor. 3, and Thm.X.4.29, Part III,

(4)  Arg(R) = AV(RUSC²(P)).

By (1), (2), and Thm.XII.3.2, Cor. 3,

(5)  P ε Word.

By (1), (2), and Thm.XII.3.2, Part III,

(6)  φ = No(P).

Let θ ε Val(R).  Then by (3), y ε AV(P).θ = No(seg_y P).  So by Thm.
XII.3.10, θ < φ.  Then by Thm.XII.3.5, corollary, θ(≤ - I)φ.  Thus
θ ε AV(seg_φ ≤).  Conversely, let θ ε AV(seg_φ ≤).  Then θ < φ.  Then by
Thm.XII.3.9, y ε AV(P).θ = No(seg_y P).  Thus θ ε Val(R).  So

(7)  Val(R) = AV(seg_φ ≤).

Clearly R ε Funct.  Let xRθ.uRθ.  Then y ε AV(P).x = {{y}}.θ =
No(seg_y P), and v ε AV(P).u = {{v}}.θ = No(seg_v P).  Then (seg_y P) smor
(seg_v P) by Thm.X.6.14, Part V, and Thm.XII.3.2, Part II.  Then y = v by
Thm.XII.2.2.  So x = u.  Thus

(8)  R ε 1-1.

Let x(RUSC²(P))u.  Then x = {{y}}.u = {{v}}.yPv.  Then by Thm.
XII.3.11, corollary, No(seg_y P) ≤ No(seg_v P).  By (3), R(x) = No(seg_y P)
and R(u) = No(seg_v P).  Thus R(x) ≤ R(u).  Then by (7), (R(x))(seg_φ ≤)
(R(u)).  So

(9)  (x,u):x(RUSC²(P))u. ⊃ .(R(x))(seg_φ ≤)(R(u)).

Let θ(seg_φ ≤)ψ.  Then θ < φ.ψ < φ.θ ≤ ψ.  By Thm.XII.3.9, y ε AV(P).
θ = No(seg_y P) and v ε AV(P).ψ = No(seg_v P).  Since θ ≤ ψ, we have yPv
by Thm.XII.3.11, corollary.  By (3), Ř(θ) = {{y}} and Ř(ψ) = {{v}}.
So by Thm.X.3.16, corollary, (Ř(θ))(RUSC²(P))(Ř(ψ)).  Thus

(10)  (θ,ψ):θ(seg_φ ≤)ψ. ⊃ .(Ř(θ))(RUSC²(P))(Ř(ψ)).

By Thm.XII.1.1, Part IV, our theorem follows from (8), (4), (7), (9),
and (10).

Corollary.     ⊢ (φ):.φ ε NO: ⊃ :(P):P ε φ. ≡ .RUSC²(P) smor (seg_φ ≤o).

Proof.  Let φ ε NO and RUSC²(P) smor (seg_φ ≤).  Then by Thm.
XII.3.2, Parts I and III, Q ε φ.  By the theorem, RUSC²(Q) smor (seg_φ ≤).
So RUSC²(P) smor RUSC²(Q).  Then P smor Q by Thm.XII.1.5.  Finally
P ε φ by Thm.XII.3.3, Part II.

In order to carry out the proof above, the formula defining R must be
stratified with x and θ having the same type.  If we should try to prove

(A)  (P,φ):P ε φ.φ ε NO. ⊃ .P smor (seg_φ ≤)

in a similar fashion, we would have to define R = x̂θ̂(x ε AV(P).θ =
No(seg_x P)).  However, the formula involved in this case cannot be
stratified with x and θ having the same type, and so we do not know how to
prove that such an R exists.  Accordingly, we do not know how to prove
the statement (A).  In the classical theory of ordinals, there were no
inhibitions about the definition of relations by unstratified statements.
Hence, in the classical theory of ordinals, one does prove (A), and indeed
it is a key theorem in the classical theory of ordinals.  In many of the cases
in which (A) is used in the classical theory of ordinals, we can use Thm.
XII.3.13 instead.  However, we cannot do so in all cases, so that apparently
the classical theory of ordinals contains results which we cannot
prove.  This is just as well, because one of the results which can be proved
by (A) in the classical theory of ordinals is the Burali-Forti paradox.  We
are apparently spared this because of inability to prove (A).

In intuitive mathematics, it appears obvious that P smor RUSC(P),
because we merely have to pair x with {x} for every x in AV(P).  However,
this procedure involves defining a relation by an unstratified statement
in a way which we do not know how to do.  Actually, if we knew how to
prove P smor RUSC(P) for general P, we would get RUSC(P) smor
RUSC²(P), so that we could get P smor RUSC²(P).  Then we could get
(A) from Thm.XII.3.13.  Thus it appears unlikely that we can prove
P smor RUSC(P) for general P.  Nevertheless, we shall prove this for
various of the familiar P's of everyday mathematics.
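
The "intuitive" pairing of x with {x} is easy to write down in an untyped
setting, which is exactly what the formal system does not permit for general
P.  The Python sketch below shows only what the intuitive argument does; it
is not a proof in the system of the text.  The names rusc and singleton_map
are hypothetical, and frozenset is used to model unit classes.

```python
def rusc(rel):
    """Relation between unit classes: {x} rusc(rel) {y} whenever x rel y."""
    return {(frozenset({a}), frozenset({b})) for (a, b) in rel}

def singleton_map(rel):
    """The pairing x -> {x} on the field of rel (unstratified in the text)."""
    return {a: frozenset({a}) for pair in rel for a in pair}

# A two-element reflexive chain 1 <= 2.
P = {(1, 1), (1, 2), (2, 2)}
f = singleton_map(P)

# Carrying P across the pairing reproduces RUSC(P) exactly.
image = {(f[a], f[b]) for (a, b) in P}
```

Python raises no objection because it has no type discipline; in the text the
same definition is blocked because x and {x} cannot be given the same type.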

We now prove the key theorem in the theory of ordinal numbers.

**Theorem XII.3.14.     ⊢ ≤o ε Word.

Proof.     Let β ∩ NO ≠ Λ.  Then φ ε β ∩ NO.

Case 1.     (θ):θ ε β ∩ NO. ⊃ .~(θ < φ).  Then φ min_≤ β.

Case 2.  ~(θ):θ ε β ∩ NO. ⊃ .~(θ < φ).  Then there is a θ such that
θ ε β ∩ NO.θ < φ.  Then θ ε AV(seg_φ ≤), so that

(1)  β ∩ AV(seg_φ ≤) ≠ Λ.

By Thm.XII.3.2, Part I and Part III, P ε Word.P ε φ.  Then RUSC²(P) ε
Word by Thm.X.6.15, Part V, and RUSC²(P) smor (seg_φ ≤) by Thm.
XII.3.13.  Thus (seg_φ ≤) ε Word by Thm.XII.1.14.  Then by (1), there is a
ψ such that ψ min_R β, where we write R for seg_φ ≤.  Then we show ψ min_≤ β
without difficulty.

Thus in either case, we find a minimal element in β.

Not only do we not know how to prove the statement (A), we can even
prove it to be false.

Theorem XII.3.15.     ⊢ ~(P,φ):P ε φ.φ ε NO. ⊃ .P smor (seg_φ ≤).

Proof.     Proof by reductio ad absurdum.  Assume

(1)  (P,φ):P ε φ.φ ε NO. ⊃ .P smor (seg_φ ≤).

Now put φ = No(≤).  By Thm.XII.3.14 and Thm.XII.3.1, Part I, ≤ ε φ,
and by Thm.XII.3.2, Part I, φ ε NO.  So by (1), ≤ smor (seg_φ ≤).  Taking
P to be ≤, Q to be seg_φ ≤, and x to be φ in Thm.XII.2.1 gives a
contradiction.

In the classical theory of ordinals, where statement (A) is provable, the
above result leads to a contradiction which is known as the Burali-Forti
paradox.


EXERCISES

Define:

0o         for         No(Λ).

1o         for         No({⟨Λ,Λ⟩}).

φ +o θ        for        R̂(EP,Q):P ε φ.Q ε θ.R smor (({⟨Λ,Λ⟩} ×. P) +.
({⟨V,V⟩} ×. Q)).

φ ×o θ        for        R̂(EP,Q):P ε φ.Q ε θ.R smor (P ×. Q).

XII.3.1.     Prove:

(a)  ⊢ 0o = 0.

(b)  ⊢ 0 ε NO.

(c)  ⊢ (φ):φ ε NO. ⊃ .0 ≤ φ.

(d)  ⊢ (φ):φ ε NO. ⊃ .φ +o 0 = φ = 0 +o φ.

(e)  ⊢ (φ):φ ε NO. ⊃ .φ ×o 0 = 0 = 0 ×o φ.

XII.3.2.     Prove:

(a)  ⊢ 1o ε NO.

(b)  ⊢ (φ):φ ε NO. ⊃ .1o ×o φ = φ = φ ×o 1o.

(c)  ⊢ (φ):φ ε NO. ⊃ .φ +o 1o ε NO.

(d)  ⊢ (φ):φ ε NO. ⊃ .φ < φ +o 1o.

(e)  ⊢ (φ):.φ ε NO: ⊃ :(Eθ).θ ε NO.φ < θ.

Thus there is no greatest ordinal.
XII.3.3.     Prove: 

(a)  ⊢ (φ,θ):φ,θ ε NO. ⊃ .φ +o θ ε NO.

(b)  ⊢ (φ,θ,ψ):φ,θ,ψ ε NO. ⊃ .(φ +o θ) +o ψ = φ +o (θ +o ψ).

(c)  ⊢ (φ,θ,ψ):φ,θ,ψ ε NO.θ < ψ. ⊃ .(φ +o θ) < (φ +o ψ).

(d)  ⊢ (φ,θ):.φ,θ ε NO: ⊃ :φ < θ. ≡ .(Eψ).ψ ε NO.ψ ≠ 0.θ = φ +o ψ.

(e)  ⊢ (φ,θ):.φ,θ ε NO: ⊃ :φ ≤ θ. ≡ .(Eψ).ψ ε NO.θ = φ +o ψ.

(f)  ⊢ (φ,θ,ψ):φ,θ,ψ ε NO.φ +o θ = φ +o ψ. ⊃ .θ = ψ.

XII.3.4.     Prove:

(a)  ⊢ (φ,θ):φ,θ ε NO. ⊃ .φ ×o θ ε NO.

(b)  ⊢ (φ,θ,ψ):φ,θ,ψ ε NO. ⊃ .(φ ×o θ) ×o ψ = φ ×o (θ ×o ψ).

(c)  ⊢ (φ,θ,ψ):φ,θ,ψ ε NO. ⊃ .φ ×o (θ +o ψ) = (φ ×o θ) +o (φ ×o ψ).

(d)  ⊢ (φ,θ,ψ):φ,θ,ψ ε NO.φ ≠ 0.θ < ψ. ⊃ .(φ ×o θ) < (φ ×o ψ).

XII.3.5.  Prove that ⊢ (R):.R ε Funct.Arg(R) = NO.Val(R) ⊆ USC²(V):
⊃ :(Eφ,θ).φ,θ ε NO.φ ≠ θ.R(φ) = R(θ).  (Hint.  Suppose the conclusion
false.  Then R ε 1-1, and we can define a well-ordering relation P with
≤ smor_R RUSC²(P).  Then, if φ = No(P), we have RUSC²(P) smor
(seg_φ ≤), which leads to a contradiction.)

XII.3.6.  Prove that, if B(α) is a term such that x = B({x}) is stratified,
then there is a term A(φ) such that φ = {{A(φ)}} is stratified and

⊢ (φ):φ ε NO. ⊃ .A(φ) = B({A(θ) | θ < φ}).

(Hint.     This is essentially a special case of Thm.XII.2.11 with different
stratification conditions.)

XII.3.7.     With A(φ) and B(α) as in the preceding exercise, prove

⊢ (Eφ,θ).φ,θ ε NO.φ ≠ θ.A(φ) = A(θ).

(Hint.     Use Ex.XII.3.5.)

XII.3.8.  Prove that, if B(α) and C(x) are terms such that x = B({x})
and x = C(x) are stratified, then there is a term A(φ) such that φ =
{{A(φ)}} is stratified and

⊢ (φ):.φ ε NO.~(Eθ).θ ε NO.φ = θ +o 1o: ⊃ :A(φ) = B({A(θ) | θ < φ}).

⊢ (φ):φ ε NO. ⊃ .A(φ +o 1o) = C(A(φ)).

XII.3.9.     With A(φ), B(α), and C(x) as in the preceding exercise, prove

4.  The Cardinal Number Associated with an Ordinal Number.     We
define

ω         for         No(Nn↿≤c↾Nn).

Clearly ω is stratified and may be assigned any type.

Theorem XII.4.1.

I.  ⊢ (Nn↿≤c↾Nn) ε ω.
II.  ⊢ ω ε NO.

Proof.     Use Ex.XI.3.7.

Theorem XII.4.2.

I.  ⊢ (Nn↿≤c↾Nn) smor RUSC(Nn↿≤c↾Nn).

II.  ⊢ (seg_ω ≤) ε ω.

III.  ⊢ θ̂(θ < ω) ε Den.

Proof of Part I.     Write

(1)  P = Nn↿≤c↾Nn,

(2)  Q = RUSC(P).

By Ex.XI.3.7 and Thm.X.6.15, Part V, ⊢ P,Q ε Word.  Then by Thm.
XII.2.14, P sr Q.v.P smor Q.v.Q sr P.

Case 1.  Let P sr Q.  Then x ε AV(Q).P smor (seg_x Q).  So by Thm.
X.4.29, Part III, y ε Nn.x = {y}.  Then by Thm.X.6.15, Part VII,
seg_x Q = RUSC(seg_y P).  Then by Thm.X.4.29, Part III, AV(seg_x Q) =
USC(seg_y P) = USC(m̂(m ε Nn.m < y)).  But by Thm.XII.1.1, Part IV,
AV(P) sm AV(seg_x Q).  Thus Nn sm USC(m̂(m ε Nn.m < y)).  By
Thm.XI.3.43, Cor. 1, and Thm.XI.3.38, USC(m̂(m ε Nn.m < y)) ε Fin.
Then Nn ε Fin, contradicting Thm.XI.3.30, Cor. 1.

Case 2.     Let Q sr P.  In this case we get a similar contradiction.

So P smor Q.

Proof of Part II.  Take P and Q as in the proof of Part I.  By Part I
and Thm.XII.1.5, RUSC(P) smor RUSC(Q).  That is, Q smor RUSC²(P).
So by Part I, P smor RUSC²(P).  But by Thm.XII.3.13, RUSC²(P) smor
(seg_ω ≤).  Thus P smor (seg_ω ≤).  Then by Thm.XII.3.3, Part II, (seg_ω ≤)
ε ω.

Proof of Part III.  In the preceding proof, we had P smor (seg_ω ≤).
Then AV(P) sm AV(seg_ω ≤) by Thm.XII.1.1, Part IV.  So Nn sm θ̂(θ < ω).

We refer to the number of ordinals less than φ as the cardinal number
associated with φ.  That is, we define

Card(φ)         for         Nc(θ̂(θ < φ)).

Clearly Card(φ) is stratified if and only if φ is stratified, and if stratified has
type two higher than φ.

Part III of the last theorem could be written

⊢ Card(ω) = Den.

Theorem XII.4.3.

I.  ⊢ (φ,θ):φ ≤ θ. ⊃ .Card(φ) ≤ Card(θ).

II.  ⊢ (φ,θ):.φ,θ ε NO: ⊃ :Card(φ) < Card(θ).v.Card(φ) = Card(θ).v.
Card(φ) > Card(θ).

Proof of Part I.     If φ ≤ θ, then by Thm.XI.3.6, Cor. 2, ψ̂(ψ < φ) ⊆
ψ̂(ψ < θ).  Then we use Thm.XI.2.13, Part I.

Proof of Part II.     Let φ,θ ε NO.  Then by Thm.XII.3.8, Cor. 1,
φ ≤ θ.v.θ ≤ φ.

Theorem XII.4.4.
I.  ⊢ Den < Nc(NO).
II.  ⊢ (Eφ):φ ε NO.Card(φ) > Den.

Proof of Part I.     By Thm.XII.4.2, Part III, and Thm.XI.2.13, Part I,

(1)  ⊢ Den ≤ Nc(NO).

We now prove by reductio ad absurdum that ⊢ Den ≠ Nc(NO).  So
assume Den = Nc(NO).  Then NO sm Nn.  However, by Thm.XI.4.12
and Thm.XI.1.33, Nn sm USC(Nn) and USC(Nn) sm USC²(Nn).  Then
NO sm USC²(Nn), so that there is an R such that

(2)  USC²(Nn) sm_R NO.

Now define

(3)  P = m̂n̂(m,n ε Nn.R({{m}}) ≤ R({{n}})).

We get AV(P) = Nn, by Thm.XII.3.4, corollary.  Then we easily get

(4)  RUSC²(P) smor ≤o.

Then by Thm.XII.3.14 and Thm.XII.1.14, RUSC²(P) ε Word.  So P ε
Word by Thm.X.6.15, Part V.  Put φ = No(P).  Then φ ε NO by Thm.
XII.3.2, Part I, and P ε φ by Thm.XII.3.2, Part III.  Then RUSC²(P)
smor (seg_φ ≤) by Thm.XII.3.13.  So by (4), (seg_φ ≤) smor ≤.  This is a
contradiction by Thm.XII.2.1.

Proof of Part II.     By Part I and Thm.XI.4.6, Cor. 2,

(1)  ⊢ ~NO ε Count.

Let φ = No(≤o).  By Thm.XII.4.3, Part II, Card(φ) ≤ Card(ω).v.
Card(φ) > Card(ω).

Case 1.  Card(φ) ≤ Card(ω).  But Card(ω) = Den by Thm.XII.4.2,
Part III.  Thus

(2)  θ̂(θ < φ) ε Count

by Thm.XI.4.6, Cor. 2.  By Thm.XII.3.13, RUSC²(≤o) smor (seg_φ ≤).
Then USC²(NO) sm θ̂(θ < φ) by Thm.XII.1.1, Part IV.  Then by (2) and
Thm.XI.4.1, Part V, USC²(NO) ε Count.  Finally, by Thm.XI.4.12, Cor.
12, NO ε Count, and we have a contradiction by (1).

This leaves only the possibility Card(φ) > Card(ω).  That is, Card(φ) >
Den by Thm.XII.4.2, Part III.

From Part II and Thm.XII.3.14, there is a least ordinal φ with Card(φ) >
Den.  This is often called Ω.  So we define

Ω        for         ιψ (ψ least_≤ φ̂(φ ε NO.Card(φ) > Den)).

Theorem XII.4.5.

I.  ⊢ Ω ε NO.
II.  ⊢ Card(Ω) > Den.

III.  ⊢ (φ):φ < Ω. ⊃ .Card(φ) ≤ Den.

IV.  ⊢ (φ):φ ε NO.Card(φ) ≤ Den. ⊃ .φ < Ω.

V.  ⊢ ω < Ω.

Proof.     By Thm.XII.3.14 and Thm.XII.4.4, Part II,

⊢ Ω least_≤ φ̂(φ ε NO.Card(φ) > Den).

Then Parts I and II follow, as well as

(φ):φ < Ω. ⊃ .~(Card(φ) > Den).

Since Den = Card(ω), we get Part III by Thm.XII.4.3, Part II.

Proof of Part IV.  Assume φ ε NO and ~(φ < Ω).  Then Ω ≤ φ by
Thm.XII.3.8, Cor. 3.  So Card(Ω) ≤ Card(φ) by Thm.XII.4.3, Part I.
Then Den < Card(φ) by Part II, so that ~(Card(φ) ≤ Den).

Proof of Part V.     Use Part IV and Thm.XII.4.2, Part III.

It is common to refer to the ordinals less than ω as ordinals of the first
class, and to ordinals greater than or equal to ω but less than Ω as ordinals
of the second class.

A common notation is to write ω₀ for ω and ω₁ for Ω.  Also ℵ₁ for Card(Ω).
The conjecture that Card(Ω) = c is known as the continuum hypothesis.

Theorem XII.4.6.

I.  ⊢ (seg_Ω ≤) smor RUSC(seg_Ω ≤).
II.  ⊢ Can(θ̂(θ < Ω)).
III.  ⊢ (seg_Ω ≤) ε Ω.

Proof of Part I.     This goes like the proof of Part I of Thm.XII.4.2.  Put

(1)  P = seg_Ω ≤,

(2)  Q = RUSC(P).

Then P sr Q.v.P smor Q.v.Q sr P.

Case 1.     Let P sr Q.  Then x ε AV(Q).P smor (seg_x Q).  Thus φ < Ω,
x = {φ}, and AV(seg_x Q) = USC(seg_φ P) = USC(θ̂(θ < φ)).  So θ̂(θ < Ω) sm
USC(θ̂(θ < φ)).  However, by Thm.XII.4.5, Part III, and Thm.XI.4.6,
Cor. 2, θ̂(θ < φ) ε Count.  So USC(θ̂(θ < φ)) ε Count.  Thus we have
a contradiction by Thm.XII.4.5, Part II.

Case 2.     Let Q sr P.  Proceed similarly.

So P smor Q.

Proof of Part II.     By Part I, θ̂(θ < Ω) sm USC(θ̂(θ < Ω)).

Proof of Part III.  Let P ε Ω and Q = seg_Ω ≤.  Then by Thm.XII.3.13,
RUSC²(P) smor Q.  However, by Part I, Q smor RUSC(Q).  Then by
Thm.XII.1.5, RUSC(Q) smor RUSC²(Q).  Then RUSC²(P) smor RUSC²(Q).
Then P smor Q by Thm.XII.1.5.  Then Q ε Ω by Thm.XII.3.3, Part II.

In Part II we have proved Can(α) for another well-known class of everyday
mathematics.

The reader who is familiar with the classical theory of ordinals will
recognize that we have proved the standard basic properties of Ω (though
not always by the standard proofs) except the theorem that every denumerable
set of ordinals of the second class has a bound in the second class.  The
proof of this seems to depend on the denumerable axiom of choice and must
be postponed until Chapter XIV.

EXERCISES 

XII.4.1.  Prove ⊢ (φ):.φ ε NO: ⊃ :φ < ω. ≡ .(Eθ).θ < ω.φ = No(seg_θ ≤).
(Hint.     Use Thm.XII.4.2, Part II.)

XII.4.2.     Prove ⊢ (P,φ):.P ε φ.φ ε NO: ⊃ :φ < ω. ≡ .AV(P) ε Fin.

XII.4.3.     Prove ⊢ (φ):.φ ε NO: ⊃ :φ < ω. ≡ .No(seg_φ ≤) < ω.

XII.4.4.     Prove ⊢ (φ):.φ ε NO: ⊃ :φ < ω. ≡ .Card(φ) ε Nn.     (Hint.
Use the two preceding exercises.)

XII.4.5.  Prove ⊢ (φ):.φ ε NO: ⊃ :φ < Ω. ≡ .(Eθ).θ < Ω.φ = No(seg_θ ≤).
(Hint.     Use Thm.XII.4.6, Part III.)

XII.4.6.  Prove ⊢ (P,φ):P ε φ.φ < Ω. ⊃ .AV(P) ε Count.  (Hint.  By
Thm.XII.3.13, USC²(AV(P)) sm θ̂(θ < φ).  So USC²(AV(P)) ε Count by
Thm.XII.4.5, Part III.)

XII.4.7.  Prove ⊢ (P,φ):P ε φ.φ ε NO.AV(P) ε Count. ⊃ .φ < Ω.
(Hint.     Use Thm.XII.3.13 and Thm.XII.4.5, Part IV.)

XII.4.8.  Prove ⊢ (φ):.φ ε NO: ⊃ :φ < Ω. ≡ .No(seg_φ ≤) < Ω.  (Hint.
Use the two preceding exercises.)

XII.4.9.  Prove ⊢ (α)(EP).P ε Word.α = AV(P).: ⊃ :.(m,n):m,n ε NC.
⊃ .m < n.v.m = n.v.m > n.  (Hint.  Let α ε m, β ε n, α = AV(P).
P ε Word, β = AV(Q).Q ε Word.  Then use Thm.XII.2.14 and Ex.XII.2.1.)

XII.4.10.  Prove ⊢ (α,β)::β = φ̂(EP,γ):P ε Word.γ ⊆ α.γ sm AV(P).
φ = No(P).: ⊃ :.~(Nc(β) ≤ Nc(USC²(α))).  (Hint.  If Nc(β) ≤
Nc(USC²(α)), then there is a γ with β sm USC²(γ).γ ⊆ α.  Then the
well-ordering relation ≤ on β induces a well-ordering relation P with (β↿≤↾β)
smor RUSC²(P) and γ = AV(P).  Then No(P) ε β.  Also if we put
φ = No(P), then (seg_φ ≤) smor RUSC²(P), so that (seg_φ ≤) smor (β↿≤↾β).
As (seg_φ ≤) = seg_φ(β↿≤↾β), we get a contradiction.)

Note.  Thm.XII.4.4, Part I, is essentially a special case of the result
stated here, and the proof of Thm.XII.4.4, Part I, is a special case of the
proof indicated here.

XII.4.11.  Prove ⊢ (m,n):m,n ε NC. ⊃ .m < n.v.m = n.v.m > n.:. ⊃ :.
(α)(EP).P ε Word.α = AV(P).  (Hint.  Take β as in Ex.XII.4.10.  Then
by hypothesis, there is a γ with α sm γ.USC²(γ) ⊆ β.  As β is well ordered
by ≤, we get a well-ordering of γ, and hence of α.)

XII.4.12.     Prove:

(a)  ⊢ Card(Ω) < Nc(NO).

(b)  ⊢ (Eφ).φ ε NO.Card(φ) > Card(Ω).

Define ω₂ as the least ordinal φ such that Card(φ) > Card(Ω).

XII.4.13.     Prove:

(a)  ⊢ ω₂ ε NO.

(b)  ⊢ Card(ω₂) > Card(Ω).

(c)  ⊢ (φ):φ < ω₂. ⊃ .Card(φ) ≤ Card(Ω).

(d)  ⊢ (φ):φ ε NO.Card(φ) ≤ Card(Ω). ⊃ .φ < ω₂.

(e)  ⊢ Ω < ω₂.

XII.4.14.     Prove:

(a)  ⊢ (φ):φ = ω₂. ⊃ .(seg_φ ≤) smor RUSC(seg_φ ≤).

(b)  ⊢ Can(θ̂(θ < ω₂)).

(c)  ⊢ (φ):φ = ω₂. ⊃ .(seg_φ ≤) ε φ.

XII.4.15.  Prove ⊢ Den ≤ Nc(θ̂(ω ≤ θ < Ω)).  (Hint.  Prove ⊢ (θ̂(Eφ):
φ < ω.θ = ω +o φ) ε Den.)

XII.4.16.     Prove ⊢ Card(Ω) = Nc(θ̂(ω ≤ θ < Ω)).

XII.4.17.     Prove ⊢ (φ,θ):Card(θ) < Card(φ). ⊃ .θ < φ.

XII.4.18.  Prove ⊢ (α)::α ε Fin:(θ):θ ε α. ⊃ .θ < Ω:. ⊃ :.(Eφ):φ < Ω:
(θ):θ ε α. ⊃ .θ < φ.

5.  Applications.  The applications of ordinal numbers are less common
than formerly.  In topology, it is customary to assume that all classes can
be well ordered.  That is,

(α)(EP).P ε Word.α = AV(P).

Using this, many theorems were proved by transfinite induction and many
definitions were made by transfinite induction.  Nowadays, use is commonly
made of the more convenient results known as Zorn's lemmas (see
Chapter XIV), and the use of ordinal numbers and well-ordered sets has
diminished.  However, because this theory is well known to the older
generation of mathematicians, many illustrations and counterexamples are
based on it, and it does not seem likely that it will ever fall into complete
disuse.


CHAPTER  XIII 

COUNTING 

1.  Preliminaries.     We prove easily:

Theorem XIII.1.1.

I.  ⊢ m̂(m ε Nn.0 < m ≤ 0) ε 0.
II.  ⊢ (n):n ε Nn.m̂(m ε Nn.0 < m ≤ n) ε n. ⊃ .m̂(m ε Nn.0 < m ≤ n + 1)
ε n + 1.
Corollary 1.     ⊢ m̂(m ε Nn.0 < m ≤ 1) ε 1.
Corollary 2.     ⊢ m̂(m ε Nn.0 < m ≤ 2) ε 2.
Corollary 3.     ⊢ m̂(m ε Nn.0 < m ≤ 3) ε 3.
etc.

It would appear to be a simple matter of proof by induction to prove

(A)  (n):n ε Nn. ⊃ .m̂(m ε Nn.0 < m ≤ n) ε n.

However, the statement in question is not stratified and we apparently
have no means of proving it.  We can prove each particular instance with
n = 0, 1, 2, 3, . . . (see the corollaries to Thm.XIII.1.1), since this involves
only an induction about our formal logic in the intuitive logic.  This is not
the same as an induction in the formal logic, which is what we need to infer
(A) from Thm.XIII.1.1, and for which (A) would have to be a stratified
statement.

If α ε Fin, then by Thm.XI.3.28, Part I, there is an n with

(B)  n ε Nn.α ε n.

By Thm.XI.4.7, there is also an n with

(C)  n ε Nn.α sm m̂(m ε Nn.0 < m ≤ n).

One would expect that these two n's should be the same, but unless we
can prove (A), it does not seem possible to prove that the two n's are the
same.  If we have any explicit n, such as 0, 1, 2, 3, . . . , then we can use
the corollaries to Thm.XIII.1.1 to infer that the n's of (B) and (C) are
the same.  However, for arbitrary n, we apparently have no procedure for
identifying the two n's of (B) and (C).

Theorem XIII.1.2.

I.  ⊢ (α):α ε 0. ⊃ .Can(α).

II.  ⊢ (n)::n ε Nn:.(α):α ε n. ⊃ .Can(α).: ⊃ :.(α):α ε n + 1. ⊃ .Can(α).


Proof.     Use Thm.XI.2.65 for Part I, and Thm.XI.2.67 and Thm.
XI.2.61 for Part II.

Corollary 1.     ⊢ (α):α ε 1. ⊃ .Can(α).
Corollary 2.     ⊢ (α):α ε 2. ⊃ .Can(α).
Corollary 3.     ⊢ (α):α ε 3. ⊃ .Can(α).
etc.

We have here a completely analogous situation.  It would appear that
we should be able to prove

(D)  (n):.n ε Nn: ⊃ :(α):α ε n. ⊃ .Can(α),

but this is unstratified and we know no way to prove it.

Actually, (A) and (D) are equivalent, so that, if either could be proved,
then the other could be deduced.  This is proved in the next two theorems.

Theorem XIII.1.3.  ⊢ (n):.n ε Nn: ⊃ :(α):α ε n. ⊃ .Can(α):: ⊃ ::(n):
n ε Nn. ⊃ .m̂(m ε Nn.0 < m ≤ n) ε n.

Proof.  Assume the hypothesis and n ε Nn.  Then there is an α with
α ε n, and so Can(α).  That is, α sm USC(α).  Then USC(α) sm USC²(α),
so that USC²(α) sm α.  Then by Thm.XI.3.43, α sm m̂(m ε Nn.m < n).
Then by Ex.XI.3.3, α sm m̂(m ε Nn.0 < m ≤ n).  Thus m̂(m ε Nn.
0 < m ≤ n) ε n by Thm.XI.2.3, Part II.

Theorem XIII.1.4.  ⊢ (n):n ε Nn. ⊃ .m̂(m ε Nn.0 < m ≤ n) ε n:: ⊃ ::
(n):.n ε Nn: ⊃ :(α):α ε n. ⊃ .Can(α).

Proof.  Assume the hypothesis and n ε Nn and α ε n.  Then by Ex.
XI.3.3 and Thm.XI.3.43, USC²(α) sm m̂(m ε Nn.0 < m ≤ n), and by the
hypothesis and Thm.XI.2.3, Part I, α sm m̂(m ε Nn.0 < m ≤ n).  So

(1)  α sm USC²(α).

By Thm.XI.3.28, α ε Fin, so that by Thm.XI.3.38, USC(α) ε Fin.  Hence
m ε Nn.USC(α) ε m.  By Thm.XI.3.7, m < n.v.m = n.v.m > n.

Case 1.  m < n.  Then by Thm.XI.3.5, p ε Nn.n = m + p + 1.  So
β ε m.γ ε p + 1.β ∩ γ = Λ.β ∪ γ = α.  Then

(2)  β ⊆ α,

and by Thm.XI.2.3, Part I, β sm USC(α).  Then USC(β) sm USC²(α), so
that by (1), USC(β) sm α.  Then

(3)  Nc(USC(β)) = n

by Thm.XI.2.3, Part II.  By (2) and Thm.IX.6.12, Part V, USC(β) ⊆
USC(α).  So by Thm.XI.2.13, Part I, Nc(USC(β)) ≤ m.  Then by (3),
n ≤ m, and we have a contradiction.

Case 2.  m > n.  Then we get a similar contradiction by using Thm.
IX.6.14, Cor. 2.

Thus we have left only the case m = n.  Then α sm USC(α), so that
Can(α).

EXERCISES 

XIII.1.1.  Prove ⊢ (φ):φ < ω. ⊃ .(seg_φ ≤) ε φ.: ≡ :.(P,φ):P ε φ.φ < ω. ⊃ .
P smor RUSC(P).

XIII.1.2.  Prove ⊢ (φ):φ < Ω. ⊃ .(seg_φ ≤) ε φ.: ≡ :.(P,φ):P ε φ.φ < Ω. ⊃ .
P smor RUSC(P).

2.  The Axiom of Counting.  We have defined Nn in such a way that,
if n ε Nn, one would expect to express the fact that α has n members by the
statement

(A)  α ε n.

In actual practice, one determines the number of members of α by counting
it.  That is, one points successively at members of α and says "one",
"two", "three", ....  If one says "n" when one points at the final member
of α, one then says that α has n members.  By the act of pointing at members
of α and simultaneously enunciating the names of positive integers, we
set up a one-to-one correspondence between the members of α and the
positive integers less than or equal to n.  That is, we establish empirically
that

(B)  α sm m̂(m ε Nn.0 < m ≤ n).

Thus, in actual practice, it is condition (B) that is taken as the definition
of α having n members.  The theorems of mathematics which have to do
with counting are all based on (B).  Thus if we wish to reproduce such
theorems in our symbolic logic, we have to arrange that (B) holds whenever
α has n members.
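
The act of counting described above can be sketched in Python.  This is an
informal illustration only: the function count is a hypothetical name, the
members of the collection are assumed to be distinct, and Python integers
stand in for the members of Nn.

```python
def count(collection):
    """Point at each member in turn, saying 1, 2, 3, ...; return the last
    number said together with the pairing of members and numbers."""
    pairing = {}
    n = 0
    for member in collection:
        n += 1               # say the next positive integer
        pairing[member] = n  # pair it with the member pointed at
    return n, pairing

n, pairing = count(["x", "y", "z"])
```

The returned pairing is a one-to-one correspondence between the collection
and the positive integers less than or equal to n, which is condition (B).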

In  this  situation,  we  have  two  alternatives.  One  is  to  abandon  (A) 
completely  and  adopt  (B)  as  the  definition  of  a  having  n  members.  This 
would  be  quite  awkward  with  the  approach  which  we  have  made  to  cardinal 
numbers.  Also,  it  is  intuitively  obvious  that  (A)  should  hold  whenever  (B) 
does,  and  vice  versa. 

The  other  alternative,  which  we  adopt,  is  to  retain  (A)  as  the  definition 
of  a  having  n  members,  and  to  ensure  that  (A)  and  (B)  will  be  equivalent. 
This  we  do  by  adopting  the  axiom  of  counting : 

Axiom  scheme  14.  The  following  statement,  and  each  statement  got 
from  it  by  prefixing  some  set  of  universal  quantifiers,  is  an  axiom : 

$(n) : n \in \mathrm{Nn} .\supset. \hat{m}(m \in \mathrm{Nn} . 0 < m \le n) \in n$.

**Theorem XIII.2.1. $\vdash (\alpha,n) :. n \in \mathrm{Nn} : \supset : \alpha \in n .\equiv. \alpha \;\mathrm{sm}\; \hat{m}(m \in \mathrm{Nn} . 0 < m \le n)$.

Proof. Use Thm.XI.2.3.

Theorem XIII.2.2. $\vdash (\alpha,n) : \alpha \in n . n \in \mathrm{Nn} .\supset. \mathrm{Can}(\alpha)$.

Proof. Use Thm.XIII.1.4.

**Corollary. $\vdash (\alpha) : \alpha \in \mathrm{Fin} .\supset. \mathrm{Can}(\alpha)$.

With this corollary, we can assert $\mathrm{Can}(\alpha)$ for those $\alpha$'s which occur with
reasonable frequency in everyday mathematics. Indeed, we can say more
than this. From an intuitive point of view, it would seem obvious that one
could pair the members x of $\alpha$, respectively, with the members {x} of
USC($\alpha$). We say that $\alpha$ is strongly Cantorian if this can be done, namely,

stCan($\alpha$) for $(\mathrm{E}R) : \alpha \;\mathrm{sm}_R\; \mathrm{USC}(\alpha) : (x,y) : xRy .\supset. y = \{x\}$.

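Theorem XIII.2.3. $\vdash \mathrm{stCan}(\mathrm{Nn})$.
Proof. Put

(1) $P = \mathrm{Nn} \upharpoonleft \le \upharpoonright \mathrm{Nn}$,

(2) $Q = \mathrm{RUSC}(P)$.

By Thm.XII.4.2, Part I, there is an R such that

(3) $P \;\mathrm{smor}_R\; Q$.

Let $xRy$. Then $x \in \mathrm{Nn} . y \in \mathrm{USC}(\mathrm{Nn})$. So $z \in \mathrm{Nn} . y = \{z\}$. Also by Thm.
XII.1.13, Cor. 1, $(\mathrm{seg}_x P) \;\mathrm{smor}\; (\mathrm{seg}_y Q)$. Then $\mathrm{AV}(\mathrm{seg}_x P) \;\mathrm{sm}\; \mathrm{AV}(\mathrm{seg}_y Q)$.
That is, $\hat{m}(m \in \mathrm{Nn} . m < x) \;\mathrm{sm}\; \mathrm{USC}(\hat{m}(m \in \mathrm{Nn} . m < z))$. But
$\hat{m}(m \in \mathrm{Nn} . m < z) \in \mathrm{Fin}$ by Thm.XI.3.43, Cor. 1. Then $\mathrm{Can}(\hat{m}(m \in \mathrm{Nn} . m < z))$ by
Thm.XIII.2.2, corollary. So $\hat{m}(m \in \mathrm{Nn} . m < x) \;\mathrm{sm}\; \hat{m}(m \in \mathrm{Nn} . m < z)$. If
$x < z$, then $\hat{m}(m \in \mathrm{Nn} . m < z)$ is similar to a proper subset of itself, contrary
to Thm.XI.3.29, Cor. 2. If $x > z$, we get a similar contradiction. Then
$x = z$ by Thm.XI.3.7. So $y = \{x\}$.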

Thus  Nn  is  strongly  Cantorian.  So,  indeed,  are  the  other  familiar  classes 
of  mathematics,  as  we  show  below. 
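For a finite class, the pairing of each x with {x} that defines strong Cantorian-ness can be written out directly. The sketch below is only an illustration: in Python's ordinary set model the pairing always exists, so it shows the shape of the relation R and says nothing about the stratification issues that make the notion nontrivial in the symbolic logic. The helper names are hypothetical.

```python
def usc(alpha):
    """USC(alpha): the unit subclasses {x} of alpha, modelled as frozensets."""
    return {frozenset([x]) for x in alpha}


def singleton_pairing(alpha):
    """The relation R in which xRy holds only when y = {x}."""
    return {(x, frozenset([x])) for x in alpha}


def is_one_to_one_between(rel, dom, cod):
    """Check that rel is a one-to-one relation with arguments dom and values cod."""
    args = [x for x, _ in rel]
    vals = [y for _, y in rel]
    return (set(args) == set(dom) and set(vals) == set(cod)
            and len(set(args)) == len(args) and len(set(vals)) == len(vals))


alpha = {1, 2, 3}
R = singleton_pairing(alpha)
print(is_one_to_one_between(R, alpha, usc(alpha)))
```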

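Theorem XIII.2.4. $\vdash (\alpha,\beta) : \mathrm{stCan}(\alpha) . \alpha \;\mathrm{sm}\; \beta .\supset. \mathrm{stCan}(\beta)$.

Proof. Let $\alpha \;\mathrm{sm}_R\; \mathrm{USC}(\alpha)$ and $(x,y) : xRy .\supset. y = \{x\}$. Let also
$\beta \;\mathrm{sm}_S\; \alpha$. Put $T = S|R|\mathrm{Cnv}(\mathrm{RUSC}(S))$. Then by Ex.X.5.17, $T \in$ 1-1 and
$(x,y) : xTy .\supset. y = \{x\}$. Also, one proves easily that $\mathrm{Arg}(T) = \beta$ and $\mathrm{Val}(T)
= \mathrm{USC}(\beta)$.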

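Theorem XIII.2.5. $\vdash (\alpha,\beta) : \mathrm{stCan}(\alpha) . \beta \subseteq \alpha .\supset. \mathrm{stCan}(\beta)$.

Proof. Obvious.

Theorem XIII.2.6. $\vdash (\alpha) : \alpha \in \mathrm{Count} .\supset. \mathrm{stCan}(\alpha)$.

Proof. Let $\alpha \in \mathrm{Count}$. Then by Thm.XI.4.7, corollary, $\beta \subseteq \mathrm{Nn} . \alpha \;\mathrm{sm}\; \beta$.
Then from Thm.XIII.2.3, we get stCan($\beta$) by Thm.XIII.2.5, and thence
stCan($\alpha$) by Thm.XIII.2.4.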

Theorem XIII.2.7. $\vdash (\alpha) : \mathrm{stCan}(\alpha) .\supset. \mathrm{stCan}(\mathrm{SC}(\alpha))$.


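Proof. Let $\alpha \;\mathrm{sm}_R\; \mathrm{USC}(\alpha)$ and $(x,y) : xRy .\supset. y = \{x\}$. Define $S =
\hat{\beta}\hat{\gamma}(\beta \subseteq \alpha . (\mathrm{E}\delta) . R``\beta = \mathrm{USC}(\delta) . \gamma = \{\delta\})$. Since $R``\beta = \mathrm{USC}(\beta)$ if $\beta \subseteq \alpha$,
it is clear that $\mathrm{SC}(\alpha) \;\mathrm{sm}_S\; \mathrm{USC}(\mathrm{SC}(\alpha))$ and $(\beta,\gamma) : \beta S \gamma .\supset. \gamma = \{\beta\}$.

Corollary. $\vdash (\alpha) : \alpha \in c .\supset. \mathrm{stCan}(\alpha)$.

Proof. Use Thm.XI.5.5, Cor. 2.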

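Theorem XIII.2.8. $\vdash (P) : P \in \mathrm{Rel} . \mathrm{stCan}(\mathrm{AV}(P)) .\supset. P \;\mathrm{smor}\; \mathrm{RUSC}(P)$.

Proof. Let $\mathrm{AV}(P) \;\mathrm{sm}_R\; \mathrm{USC}(\mathrm{AV}(P))$, and $(x,y) : xRy .\supset. y = \{x\}$. Then
clearly $P \;\mathrm{smor}_R\; \mathrm{RUSC}(P)$ if $P \in \mathrm{Rel}$.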

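Theorem XIII.2.9. $\vdash \mathrm{stCan}(\hat{\phi}(\phi < \Omega))$.

Proof. Put

(1) $P = \mathrm{seg}_\Omega \le$,

(2) $Q = \mathrm{RUSC}(P)$.

By Thm.XII.4.6, Part I, there is an R such that

(3) $P \;\mathrm{smor}_R\; Q$.

Let $\phi R \alpha$. Then $\phi < \Omega . \theta < \Omega . \alpha = \{\theta\}$. Also by Thm.XII.1.13, Cor. 1,
$(\mathrm{seg}_\phi P) \;\mathrm{smor}\; (\mathrm{seg}_\alpha Q)$. But $\mathrm{seg}_\alpha Q = \mathrm{RUSC}(\mathrm{seg}_\theta P)$ by Thm.X.6.15, Part
VII. Now $\mathrm{AV}(\mathrm{seg}_\theta P) \in \mathrm{Count}$ by Thm.XII.4.5, Part III. So $(\mathrm{seg}_\theta P) \;\mathrm{smor}\;
\mathrm{RUSC}(\mathrm{seg}_\theta P)$ by Thm.XIII.2.6 and Thm.XIII.2.8. Then $(\mathrm{seg}_\phi P) \;\mathrm{smor}\;
(\mathrm{seg}_\theta P)$. Hence $\phi = \theta$ by Thm.XII.2.2, so that $\alpha = \{\phi\}$.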

EXERCISES 

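XIII.2.1. Prove $\vdash (\alpha) : \mathrm{stCan}(\alpha) .\equiv. \mathrm{stCan}(\mathrm{USC}(\alpha))$.

XIII.2.2. With $\omega_2$ as in Ex.XII.4.13, prove $\vdash \mathrm{stCan}(\hat{\phi}(\phi < \omega_2))$.

XIII.2.3. Prove $\vdash (\phi) : \phi < \Omega .\supset. (\mathrm{seg}_\phi \le) \in \phi$.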

3.  The  Pigeonhole  Principle.  If  one  has  a  pigeonhole  desk  with  n 
pigeonholes,  and  one  has  n  +  1  bills  filed  in  these  pigeonholes,  at  least  one 
pigeonhole  must  have  two  or  more  bills  in  it. 

Put more abstractly, let $\alpha$ have m members and $\lambda$ have n members. If
$\alpha \subseteq \bigcup\lambda$ and m > n, then at least one member of $\lambda$ must have more than
one member in common with $\alpha$. That is,

$(\mathrm{E}\beta) : \beta \in \lambda . \sim (\alpha \cap \beta) \in 0 \cup 1$.

Here $\alpha$ is the class of bills, and $\lambda$ is the class of pigeonholes (speaking
roughly).

To prove the principle in this form requires the axiom of choice, as far
as we know. Actually, in all applications that we know of, it is assumed
further that $\lambda$ is disjoint. That is,

$(\alpha,\beta) : \alpha,\beta \in \lambda . \alpha \ne \beta .\supset. \alpha \cap \beta = \Lambda$.


In words, no two members of $\lambda$ overlap. In terms of pigeonholes, no bill
is put in two pigeonholes at once.

To the best of our knowledge, the principle is used only when $\lambda \in \mathrm{Fin}$.
So we prove the theorem for this case only, although a more general theorem
could be proved.
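**Theorem XIII.3.1.

(1) $\mathrm{Nc}(\lambda) < \mathrm{Nc}(\alpha)$,

(2) $\alpha \subseteq \bigcup\lambda$,

(3) $\lambda \in \mathrm{Fin}$,

(4) $(\beta,\gamma) : \beta,\gamma \in \lambda . \beta \ne \gamma .\supset. \beta \cap \gamma = \Lambda$

yield

$(\mathrm{E}\beta) : \beta \in \lambda . \sim (\alpha \cap \beta) \in 0 \cup 1$.

Proof. Assume the hypotheses. Define

(5) $R = \hat{\beta}\hat{\gamma}((\mathrm{E}x) . x \in \alpha . \beta = \{x\} . x \in \gamma . \gamma \in \lambda)$.
By (4),

(6) $R \in \mathrm{Funct}$.
Also by (2),

(7) $\mathrm{Arg}(R) = \mathrm{USC}(\alpha)$.
Also clearly

(8) $\mathrm{Val}(R) \subseteq \lambda$.

Case 1. $\alpha \in \mathrm{Infin}$. Then by (7) and Thm.XI.3.39, Cor. 2, $\mathrm{Arg}(R) \in \mathrm{Infin}$.
Also by (3), (8), and Thm.XI.3.31, $\mathrm{Val}(R) \in \mathrm{Fin}$. So by Thm.XI.3.28,
Cor. 1 and Cor. 2, and Thm.XI.3.6, Cor. 3, $\mathrm{Nc}(\mathrm{Val}(R)) < \mathrm{Nc}(\mathrm{Arg}(R))$.

Case 2. $\alpha \in \mathrm{Fin}$. Then by Thm.XIII.2.2, corollary, $\mathrm{Can}(\alpha)$. So by (7),
$\mathrm{Nc}(\mathrm{Arg}(R)) = \mathrm{Nc}(\alpha)$. Then by (1) and (7), $\mathrm{Nc}(\mathrm{Val}(R)) < \mathrm{Nc}(\mathrm{Arg}(R))$.

So in either case, $\mathrm{Nc}(\mathrm{Val}(R)) < \mathrm{Nc}(\mathrm{Arg}(R))$. Then by Ex.XI.2.12,
$(\mathrm{E}\beta,\gamma) : \beta,\gamma \in \mathrm{Arg}(R) . \beta \ne \gamma . R(\beta) = R(\gamma)$. Then by (5), $x,y \in \alpha . x \ne y .
R(\{x\}) = R(\{y\})$. But by (5), $x \in R(\{x\}) . R(\{x\}) \in \lambda$ and $y \in R(\{y\}) .
R(\{y\}) \in \lambda$.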
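A finite instance of the theorem can be checked mechanically. The sketch below assumes the pigeonholes are given as a list of pairwise disjoint frozensets and the bills as a set; the function name `crowded_pigeonhole` is hypothetical.

```python
def crowded_pigeonhole(alpha, lam):
    """Return a member beta of lam sharing more than one element with alpha, or None."""
    for beta in lam:
        if len(alpha & beta) > 1:
            return beta
    return None


# Three disjoint pigeonholes and four bills, so hypotheses (1) to (4) hold.
lam = [frozenset({1, 2, 3}), frozenset({4, 5}), frozenset({6})]
alpha = {1, 3, 4, 6}
found = crowded_pigeonhole(alpha, lam)
print(found)
```

Here the search is exhaustive, which is possible only because the family of pigeonholes is finite; the theorem guarantees that the search cannot come back empty when the hypotheses hold.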

Note. We have refrained from using the results of this chapter in subsequent
chapters, so that anyone who does not find the axiom of counting
congenial may dispense with it at the loss of no more than the theorems of
the second and third sections of the present chapter.

4. Applications. In number theory, most theorems dealing with the
number of integers having a given property depend intrinsically upon
counting them, and so depend on the axiom of counting. The prime number
theorem, concerning the number of primes less than or equal to x, is such
a theorem. To illustrate how this comes about, one could look at the proof
of Thm. 20 on page 17 of Hardy and Wright, where a lower bound is obtained
for the number of primes less than or equal to x. The crucial step
is  where  they  say  that  there  are  at  most  x/p^  numbers  divisible  by  p'  and 
not  exceeding  x.  If  we  write  n  —  [x/p^],  the  numbers  in  question  are 
1  X  p^,  2  X  p^,  ■  .  ■  ,  n  X  p^.  Clearly  this  set  of  numbers  is  in  one-to-one 
correspondence with $\hat{m}(m \in \mathrm{Nn} . 0 < m \le n)$. So by Axiom scheme 14,
there are n such numbers.
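The counting step can be illustrated numerically. In the sketch below, `d` is a hypothetical stand-in for the divisor in question and `x` for the bound; the multiples of `d` not exceeding `x` are paired with 1, ..., n where n is the integer part of x/d.

```python
def multiples_up_to(d, x):
    """All positive integers not exceeding x that are divisible by d."""
    return [k for k in range(1, x + 1) if k % d == 0]


def pairing_with_initial_segment(d, x):
    """Pair each m with 0 < m <= n with the multiple m * d, where n = x // d."""
    n = x // d
    return {m: m * d for m in range(1, n + 1)}


d, x = 4, 30
n = x // d
pairing = pairing_with_initial_segment(d, x)
print(n, sorted(pairing.values()) == multiples_up_to(d, x))
```

The equality printed at the end is the one-to-one correspondence; the conclusion that there are n such numbers is then the content of the axiom of counting.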

The  same  type  of  reasoning  appears  at  crucial  points  in  Landau,  1927, 
for  instance,  on  page  194. 

In  an  analogous  fashion,  theorems  involving  the  number  of  zeros  of  a 
function  go  back  to  the  axiom  of  counting.  For  instance,  the  theorem 
(Titchmarsh,  1939,  page  115)  that 


$$\frac{1}{2\pi i} \int_C \frac{f'(z)}{f(z)} \, dz$$


equals  the  number  of  zeros  of  f(z)  inside  C  minus  the  number  of  poles  inside 
C  if  f(z)  is  meromorphic  inside  and  on  C. 
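As a numerical illustration of this theorem, one can approximate the contour integral on the unit circle. The sketch assumes the particular function f(z) = (z - 0.3)(z + 0.2i)/(z - 0.5), chosen here only as an example, which has two zeros and one pole inside the circle; its logarithmic derivative f'(z)/f(z) is written out by hand.

```python
import cmath


def log_derivative(z):
    """f'(z)/f(z) for the example f(z) = (z - 0.3)(z + 0.2j) / (z - 0.5)."""
    return 1 / (z - 0.3) + 1 / (z + 0.2j) - 1 / (z - 0.5)


def zeros_minus_poles(g, steps=4000):
    """Approximate (1 / (2 pi i)) times the integral of g over the unit circle."""
    total = 0j
    for k in range(steps):
        theta = 2 * cmath.pi * k / steps
        z = cmath.exp(1j * theta)
        dz = 1j * z * (2 * cmath.pi / steps)   # derivative of exp(i theta) times d theta
        total += g(z) * dz
    return total / (2j * cmath.pi)


result = zeros_minus_poles(log_derivative)
print(round(result.real))
```

The value comes out close to 2 - 1 = 1, the number of zeros minus the number of poles inside the circle.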

Results on the number of permutations and combinations of sets of
elements also stem from the axiom of counting.

The pigeonhole principle is used directly in various parts of mathematics.
For examples, see the proofs given in Sec. 11.3 and Sec. 11.12 of
Hardy and Wright.

Since much of the classical theory of cardinals and ordinals is based on
the assumption $(\alpha) . \mathrm{stCan}(\alpha)$, one can get many of the classical theorems in
these domains by confining attention to those cardinals n such that

$n \in \mathrm{NC} : (\mathrm{E}\alpha) . \alpha \in n . \mathrm{stCan}(\alpha)$

and to those ordinals $\phi$ such that

$\phi \in \mathrm{NO} : (\mathrm{E}P) . P \in \phi . \mathrm{stCan}(\mathrm{AV}(P))$.


CHAPTER  XIV 
THE  AXIOM  OF  CHOICE 

1.  The  General  Axiom  of  Choice.  Mathematical  opinion  on  the  axiom 
of  choice  is  not  unanimous.  Indeed  there  are  many  shades  of  opinion.  We 
shall  illustrate  some  of  these  by  an  imaginary  conversation  between  three 
mathematicians,  X,  Y,  and  Z.  (In  Sierpinski,  1928,  pages  103  to  109, 
various  mathematicians  holding  divergent  opinions  are  cited  by  name  and 
their  opinions  actually  quoted.  There  is  also  an  extensive  discussion  in 
Zermelo,  1908,  first  paper.) 

"It is of course obvious," said Y, "that every infinite set contains a
denumerable subset. For choose an element of the set and call it $x_0$.
Infinitely many elements remain. Choose another and call it $x_1$. Still,
infinitely many elements remain. Indeed, when $x_n$ has been chosen,
infinitely many elements remain, and we can choose another and call it $x_{n+1}$.
So we can choose $x_0$, $x_1$, $x_2$, . . . , which form a denumerable subset."

"Oh,  come  now,"  protested  X.  "How  much  time  do  you  think  you  have 
available for all this choosing? At your death you would still be choosing
$x_{n+1}$."

"I  didn't  say  that  I  personally  could  choose  the  set,"  returned  Y.  "I 
merely  said  it  is  there.  Given  a  being  who  could  choose  the  elements  with 
no  waste  of  time,  he  could  choose  the  entire  set  in  one  second  by  choosing 
$x_0$ in half a second, $x_1$ in a quarter of a second, $x_2$ in an eighth of a second,
and  so  on." 

"There  you  go  with  that  infernal  'and  so  on'  again,"  objected  X.  "Just 
when  you  get  to  the  critical  point  in  a  proof,  you  say  'and  so  on'  and  wave 
your hands, and I'm supposed to be satisfied. I thought you were supposed
to prove things, and not appeal to the hypothetical abilities of a
mythical  being.    You're  doing  theology,  not  mathematics." 

Z  broke  in. 

"As  a  matter  of  fact,  it's  perfectly  clear  that  your  original  set  can  be  well 
ordered.  Why  stop  when  you  have  chosen  denumerably  many  elements? 
Go  right  on  choosing,  attaching  ordinal  numbers  as  subscripts,  until  you 
have  exhausted  the  set.    Then  you  will  have  it  well  ordered." 

"You've  exhausted  me,"  complained  X. 

"Wait  a  minute,"  said  Y.  "If  your  set  is  uncountably  infinite,  then  one 
must make an uncountable infinity of choices. Adding up the time intervals
for each choice gives an infinite time for the entire process."

Z responded: "You are assuming a finite length of time for each choice,
even  though  the  time  may  be  made  as  short  as  desired  for  each  individual 
choice.  But  why  should  a  choice  take  any  time  at  all?  I  hypothecate  a 
being  who  can  perform  a  choice  in  no  time  at  all.  Then  the  entire  well 
ordering  takes  him  no  time  at  all." 

"The  perfect  reductio  ad  absurdum,"  grumbled  X. 

Thus  it  went  far  into  the  night,  with  none  of  X,  Y,  or  Z  retreating  from 
his  respective  position.    These  positions  are  as  follows. 

X's  position:  Any  finite  number  of  choices  is  permissible,  but  not  an 
infinite  number. 

Y's position: A denumerable number of choices is permissible, but not
any larger number.

Z's position: Any number of choices is permissible.

Among  actual  mathematicians,  perhaps  a  majority  would  applaud  X's 
position,  certainly  some  agree  with  Y,  and  others  (who  perhaps  constitute 
a  minority)  agree  wholeheartedly  with  Z.  However,  many  mathematicians 
who  would  like  to  espouse  the  positions  of  X  or  Y  find  that  this  would  leave 
them  with  no  means  of  proof  for  certain  theorems  which  they  need  in  their 
research.  Accordingly,  they  accept  the  position  of  Z,  but  with  reluctance 
and  a  hope  that  someone  will  one  day  find  proofs  of  their  key  theorems 
which  do  not  involve  an  infinity  of  choices.  Thus  it  comes  about  that  we 
find  papers  written  in  which  all  the  results  of  the  paper  depend  on  a  theorem 
whose  only  known  proof  involves  an  infinity  of  choices;  nevertheless, 
throughout  the  paper  the  author  is  careful  to  avoid  the  use  of  an  infinity  of 
choices. 

A  very  complicating  factor  in  the  situation  is  that  use  of  an  infinite 
number  of  choices  is  often  concealed  by  a  plausible-sounding  exposition. 
Witness our "proof," after Thm.XI.4.15, that the members of a denumerable
set of denumerable sets are denumerable. The reasoning sounds
entirely  plausible,  but  it  conceals  a  denumerable  number  of  choices  (we 
shall  consider  this  point  later  with  some  care).  Such  implicit  uses  of  an 
infinity  of  choices  are  so  ingrained  in  the  classical  writings  that  it  is  rather 
hard  to  be  sure  which  of  the  classical  theorems  have  been  proved  without 
recourse  to  an  infinity  of  choices.  Presumably  one  could  review  the  entire 
background  of  the  discipline  in  which  one  wishes  to  work  and  check  the 
status of the relevant theorems. However, this is a colossal task. Moreover,
some of the uses of an infinity of choices are concealed under such
extremely  plausible  arguments  that  it  is  very  hard  to  be  sure  of  ferreting 
them  all  out.  Thus  many  mathematicians  accept  the  position  of  Z  out  of 
sheer  desperation,  although  they  have  no  sympathy  with  it  and  some  even 
find  it  repugnant. 

Under  the  circumstances,  there  is  nothing  for  us  to  do  but  indicate  how 
one could introduce into symbolic logic an equivalent of making an infinity
of  choices.  Then  those  who  wish  or  feel  forced  to  use  an  infinity  of  choices 
on  occasion  will  at  least  have  a  mechanical  equivalent  thereof  (which  will 
not  necessarily  be  any  more  palatable). 

A prior question is that of making a single choice. This has been discussed
at length in Sec. 7 of Chapter VI. Use of rule C is the mechanical
equivalent of a single act of choice. Since we require that each demonstration
be of finite length, we are thus entitled to make any finite number of
choices  but  not  an  infinite  number.  Thus  up  to  now  our  position  has  been 
strictly  that  of  X. 

It  is  interesting  that  justification  for  an  infinity  of  choices  can  now  be 
furnished by a suitable axiom. This axiom will be such that it will obviously
take care of the simplest instance of an infinity of choices, and a succession
of theorems will show that indeed it takes care of all cases.

Suppose we have an infinite class $\lambda$ of sets $\alpha$ such that each $\alpha$ of $\lambda$ is
nonnull and no two $\alpha$'s from $\lambda$ have a member in common. Then each $\alpha$
contains at least one x, and this x is not in any other $\alpha$ of $\lambda$. In such a
circumstance, it seems almost indisputable that there should be a set of x's
with precisely one x from each $\alpha$ of $\lambda$. Certainly, if one permits an infinity
of choices, one can show that there is such a set of x's by simply choosing
an x from each $\alpha$. However, there seems no other way to infer the existence
of such a set of x's in general.

In  the  symbolic  logic,  our  basic  tool  for  inferring  the  existence  of  classes 
is Axiom scheme 12. To use it in the present case, one must find a stratified
statement F(x) such that F(x) is true if and only if x is in some $\alpha$ of $\lambda$ and
F(y) is false for all other y's of this $\alpha$. For an arbitrary $\lambda$, we know no way
to write down such an F(x).

Alternatively,  one  might  try  to  infer  the  existence  of  such  a  set  of  x's  by 
reductio  ad  absurdum,  but  so  far  no  one  has  succeeded  in  this. 

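Thus it appears that, if we wish generally to be able to infer the existence
of such a set of x's, we must state an axiom to that effect. Such an axiom
would take the form:

$(\lambda) :: (\alpha) : \alpha \in \lambda .\supset. \alpha \ne \Lambda :. (\alpha,\beta) : \alpha,\beta \in \lambda . \alpha \ne \beta .\supset. \alpha \cap \beta = \Lambda .: \supset :. (\mathrm{E}\gamma)(\alpha) : \alpha \in \lambda .\supset. (\mathrm{E}_1 x) . x \in \alpha \cap \gamma$.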

This  is  known  as  the  "axiom  of  choice." 
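For a finite family of sets of integers, the set whose existence the axiom asserts can be produced explicitly. In the sketch below the rule "take the least member" plays the part of a definable F(x), which is exactly what is not available for an arbitrary family; the function names are hypothetical.

```python
def is_nd(lam):
    """No member of lam is empty, and distinct members are disjoint."""
    members = list(lam)
    if any(len(a) == 0 for a in members):
        return False
    return all(a.isdisjoint(b)
               for i, a in enumerate(members) for b in members[i + 1:])


def choice_set(lam):
    """One element from each member, selected by the definable rule 'least member'."""
    return {min(a) for a in lam}


def meets_each_exactly_once(gamma, lam):
    """Check that gamma has precisely one element in common with each member of lam."""
    return all(len(a & gamma) == 1 for a in lam)


lam = [frozenset({3, 1}), frozenset({7}), frozenset({5, 9, 4})]
gamma = choice_set(lam)
print(sorted(gamma), meets_each_exactly_once(gamma, lam))
```

Disjointness matters here: without it, an element chosen from one member could also lie in another member and spoil the "precisely one" condition.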

A significant consideration concerns the cardinality of $\lambda$. Thus if the
additional hypothesis $\lambda \in \mathrm{Den}$ were inserted into the above statement, our
mathematician Y would then accept it gladly, whereas he would not agree
to it at all in its present form. One can therefore distinguish between
axioms of choice of different cardinalities by putting restrictions on the
cardinality of $\lambda$. Specifically, if we write

ND for $\hat{\lambda}((\alpha) : \alpha \in \lambda .\supset. \alpha \ne \Lambda :. (\alpha,\beta) : \alpha,\beta \in \lambda . \alpha \ne \beta .\supset. \alpha \cap \beta = \Lambda)$,

then we can express the axiom of choice of cardinality n by

AxC(n) for $(\lambda) :. \mathrm{Nc}(\lambda) \le n . \lambda \in \mathrm{ND} : \supset : (\mathrm{E}\gamma)(\alpha) : \alpha \in \lambda .\supset. (\mathrm{E}_1 x) . x \in \alpha \cap \gamma$.

The unrestricted axiom of choice would be

AxC for AxC(Nc(V)).

In ND, the N stands for "nonnull" to indicate that no member of $\lambda$ is
$\Lambda$, and the D for "disjoint" to indicate that the members of $\lambda$ are disjoint,
or nonoverlapping.

There  are  a  large  number  of  useful  statements  which  are  equivalent  to 
the  general  axiom  of  choice.  We  shall  state  several  of  these  and  prove  the 
equivalences.  Actually,  each  one  of  these  statements  can  be  classified  as  to 
cardinality,  and  with  suitable  relations  between  the  various  cardinalities, 
various  implications  can  be  proved  between  them. 

We shall denote these various statements by $Z_1(n)$, $Z_2(n)$, etc., when they
have restricted cardinality, and by $Z_1$, $Z_2$, etc., when they have
unrestricted cardinality.

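We take $Z_1(n)$ to be:

"If $\alpha$ is a partially ordered set, of cardinality less than or equal to n, such
that every simply ordered subset has a supremum in $\alpha$, then there is a
maximal element of $\alpha$."

We are to understand this as follows. $\alpha$ is a partially ordered set if
$(\mathrm{E}R) . \alpha = \mathrm{AV}(R) . R \in \mathrm{Pord}$. If $\alpha = \mathrm{AV}(R) . R \in \mathrm{Pord}$, then $\beta$ is a simply
ordered subset if $\beta \subseteq \alpha$ and $(\beta \upharpoonleft R \upharpoonright \beta) \in \mathrm{Sord}$; an upper bound of a subset $\beta$
is an element x of $\alpha$ which follows all members of $\beta$, that is, $x \in \mathrm{AV}(R) : (y) :
y \in \beta .\supset. yRx$; a supremum of a subset $\beta$ is a least upper bound of $\beta$, that is,
x is a supremum of $\beta$ if $x \in \mathrm{AV}(R) : (y) : y \in \beta .\supset. yRx :: (z) :. z \in \mathrm{AV}(R) : (y) :
y \in \beta .\supset. yRz : \supset . xRz$; and a maximal element of $\alpha$ is an x such that $x \;\mathrm{max}_R\;
\alpha$. We define:

Ub($\beta$,R) for $\hat{x}(x \in \mathrm{AV}(R) : (y) : y \in \beta .\supset. yRx)$,

Sup($\beta$,R) for $\hat{x}(x \in \mathrm{Ub}(\beta,R) : (z) : z \in \mathrm{Ub}(\beta,R) .\supset. xRz)$,

Max($\beta$,R) for $\hat{x}(x \;\mathrm{max}_R\; \beta)$.

Then Ub($\beta$,R) is the set of upper bounds of $\beta$ with respect to R, Sup($\beta$,R)
is the set of least upper bounds or suprema of $\beta$ with respect to R, and
Max($\beta$,R) is the set of maximal elements of $\beta$ with respect to R. Finally,
we write:

$Z_1(n)$ for $(\alpha,R) :: \mathrm{Nc}(\alpha) \le n . \alpha = \mathrm{AV}(R) . R \in \mathrm{Pord} :. (\beta) : \beta \subseteq \alpha .
(\beta \upharpoonleft R \upharpoonright \beta) \in \mathrm{Sord} .\supset. \mathrm{Sup}(\beta,R) \ne \Lambda .: \supset :. \mathrm{Max}(\alpha,R) \ne \Lambda$.

$Z_1$ for $Z_1(\mathrm{Nc}(V))$.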
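The three definitions can be tried out on a small finite partial ordering. The sketch below models a relation as a set of ordered pairs and takes divisibility on {1, 2, 3, 4, 6} as the example; reading "x max_R β" as "x is in β and no other member of β follows x" is an assumption made for the illustration, and the function names are hypothetical.

```python
def ub(beta, R, field):
    """Upper bounds of beta: members x of the field with yRx for every y in beta."""
    return {x for x in field if all((y, x) in R for y in beta)}


def sup(beta, R, field):
    """Least upper bounds of beta: upper bounds that precede every upper bound."""
    bounds = ub(beta, R, field)
    return {x for x in bounds if all((x, z) in R for z in bounds)}


def maximal(beta, R):
    """Members of beta followed by no other member of beta (assumed reading of max_R)."""
    return {x for x in beta if not any((x, y) in R and x != y for y in beta)}


field = {1, 2, 3, 4, 6}
R = {(a, b) for a in field for b in field if b % a == 0}   # a divides b

print(ub({2, 3}, R, field), sup({2, 3}, R, field), maximal(field, R))
```

In this example {2, 3} has the single upper bound 6, which is therefore its supremum, while the whole field has the two maximal elements 4 and 6 and no upper bound.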
We take $Z_2(n)$ to be:

"Any set of cardinality less than or equal to n can be well ordered."

That is, we put:

$Z_2(n)$ for $(\alpha) : \mathrm{Nc}(\alpha) \le n .\supset. (\mathrm{E}P) . P \in \mathrm{Word} . \alpha = \mathrm{AV}(P)$.
$Z_2$ for $Z_2(\mathrm{Nc}(V))$.

We take $Z_3(n)$ to be:

"If $\lambda$, of cardinality less than or equal to n, is a set of nonempty sets, then
there is a function f such that for any $\alpha$ in $\lambda$, f($\alpha$) is a member of $\alpha$."

In order to preserve stratification, we have to make a slight variation,
namely, that f($\alpha$) is the unit class of a member of $\alpha$. This in no way
diminishes the effectiveness of the result. Then we put:

$Z_3(n)$ for $(\lambda) :: \mathrm{Nc}(\lambda) \le n :. (\alpha) : \alpha \in \lambda .\supset. \alpha \ne \Lambda .: \supset :. (\mathrm{E}R) :
R \in \mathrm{Funct} . \mathrm{Arg}(R) = \lambda . \mathrm{Val}(R) \subseteq 1 : (\alpha) : \alpha \in \lambda .\supset. R(\alpha) \subseteq \alpha$.
$Z_3$ for $Z_3(\mathrm{Nc}(V))$.

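We take $Z_4(n)$ to be:

"Let $\lambda$, of cardinality less than or equal to n, be a set of sets which is
partially ordered by inclusion. If the sum of each simply ordered subset of
$\lambda$ is in $\lambda$, then there is a maximal element of $\lambda$."

We define:

$\boldsymbol{\subseteq}$ for $\hat{\alpha}\hat{\beta}(\alpha \subseteq \beta)$,

$Z_4(n)$ for $(\lambda) :: \mathrm{Nc}(\lambda) \le n :. (\mu) : \mu \subseteq \lambda . (\mu \upharpoonleft \boldsymbol{\subseteq} \upharpoonright \mu) \in \mathrm{Sord} .\supset.
\bigcup\mu \in \lambda .: \supset :. \mathrm{Max}(\lambda,\boldsymbol{\subseteq}) \ne \Lambda$.

$Z_4$ for $Z_4(\mathrm{Nc}(V))$.

We take $Z_5(n)$ to be:

"Given a set, of cardinality less than or equal to n, and a nonvacuous
property of finite character, there exists a maximal subset having the
property."

A property of sets is said to be of finite character if a set has the property
when and only when all its finite subsets have the property. For instance,
one can state the result given in Ex.XI.3.21 as:

"A Hausdorff space is compact if and only if $\alpha \subseteq \mathrm{CS} . \bigcap\alpha \ne \Lambda$ is a
property of $\alpha$ of finite character."

We shall define FC so that $\lambda \in \mathrm{FC}$ if and only if $\alpha \in \lambda$ is a property of $\alpha$
of finite character. Thus

FC for $\hat{\lambda}((\alpha) : \alpha \in \lambda .\equiv. (\mathrm{SC}(\alpha) \cap \mathrm{Fin}) \subseteq \lambda)$.

Then we put

$Z_5(n)$ for $(\alpha,\lambda) : \mathrm{Nc}(\alpha) \le n . \lambda \ne \Lambda . \lambda \in \mathrm{FC} .\supset.
\mathrm{Max}(\lambda \cap \mathrm{SC}(\alpha),\boldsymbol{\subseteq}) \ne \Lambda$.
$Z_5$ for $Z_5(\mathrm{Nc}(V))$.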
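A small finite example may make the notion of finite character concrete. The sketch below takes, as an assumed example not drawn from the text, the property "no two distinct members have a common factor greater than 1" for sets of positive integers, and builds a maximal subset having the property by a single greedy pass. On a finite set the finite-character check is exhaustive, and no axiom of choice is involved.

```python
from itertools import combinations
from math import gcd


def has_property(s):
    """No two distinct members of s share a factor greater than 1."""
    return all(gcd(a, b) == 1 for a, b in combinations(s, 2))


def subsets(s):
    """All subsets of the finite set s (every one of them is finite)."""
    items = list(s)
    for r in range(len(items) + 1):
        for c in combinations(items, r):
            yield set(c)


def is_of_finite_character_on(s):
    """For each subset t of s: t has the property iff all its finite subsets do."""
    return all(has_property(t) == all(has_property(u) for u in subsets(t))
               for t in subsets(s))


def maximal_subset(s):
    """Greedy pass: keep each element that does not destroy the property."""
    chosen = set()
    for x in sorted(s):
        if has_property(chosen | {x}):
            chosen.add(x)
    return chosen


s = {2, 3, 4, 5, 9, 10}
best = maximal_subset(s)
print(sorted(best))
```

The greedy pass yields a maximal subset because the property is inherited by subsets: an element rejected at some stage would still conflict with the final set.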
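We take $Z_6(n)$ to be:

"Every partially ordered set of cardinality less than or equal to n contains
a maximal simply ordered subset."
That is, we put:

$Z_6(n)$ for $(\alpha,R) :: \mathrm{Nc}(\alpha) \le n . \alpha = \mathrm{AV}(R) . R \in \mathrm{Pord} .: \supset :. (\mathrm{E}\beta) :.
\beta \subseteq \alpha . (\beta \upharpoonleft R \upharpoonright \beta) \in \mathrm{Sord} : (\gamma) : \gamma \subseteq \alpha . (\gamma \upharpoonleft R \upharpoonright \gamma) \in \mathrm{Sord} .
\supset. \sim(\beta \subset \gamma)$.
$Z_6$ for $Z_6(\mathrm{Nc}(V))$.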
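For a finite partially ordered set, a maximal simply ordered subset can be found by one pass through the elements. The sketch below again uses divisibility as an assumed example, with the relation modelled as a set of ordered pairs; the function names are hypothetical.

```python
def is_chain(beta, R):
    """Every two members of beta are comparable under R."""
    return all((x, y) in R or (y, x) in R for x in beta for y in beta)


def maximal_chain(field, R):
    """Greedy pass: add each element that keeps the set simply ordered."""
    chain = set()
    for x in sorted(field):
        if is_chain(chain | {x}, R):
            chain.add(x)
    return chain


field = {1, 2, 3, 4, 6, 12}
R = {(a, b) for a in field for b in field if b % a == 0}   # a divides b
chain = maximal_chain(field, R)
print(sorted(chain))
```

No element outside the resulting chain can be added to it, so no simply ordered subset properly contains it; for infinite sets this finite search is what the statement $Z_6$ replaces.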

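We take $Z_7(n)$ to be:

"If $\alpha$ is a partially ordered set, of cardinality less than or equal to n, such
that every simply ordered subset has an upper bound in $\alpha$, then there is a
maximal element of $\alpha$."

That is, we put:

$Z_7(n)$ for $(\alpha,R) :: \mathrm{Nc}(\alpha) \le n . \alpha = \mathrm{AV}(R) . R \in \mathrm{Pord} :. (\beta) : \beta \subseteq \alpha .
(\beta \upharpoonleft R \upharpoonright \beta) \in \mathrm{Sord} .\supset. \mathrm{Ub}(\beta,R) \ne \Lambda .: \supset :.
\mathrm{Max}(\alpha,R) \ne \Lambda$.
$Z_7$ for $Z_7(\mathrm{Nc}(V))$.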

We shall now prove a series of implications between AxC(n) and $Z_i(m)$
from which it will follow that AxC and each of $Z_1$, $Z_2$, $Z_3$, $Z_4$, $Z_5$, $Z_6$, and $Z_7$
are equivalent. Many of our proofs, as well as the statements of many of
the Z's, are adapted from a set of notes by G. A. Hedlund.

Our  first  theorem  proves  a  result  needed  in  the  proof  of  our  second 
theorem. 

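Theorem XIV.1.1.

(1) $\alpha = \mathrm{AV}(P)$,

(2) $P \in \mathrm{Pord}$,

(3) $(\beta) : \beta \subseteq \alpha . (\beta \upharpoonleft P \upharpoonright \beta) \in \mathrm{Sord} .\supset. \mathrm{Sup}(\beta,P) \ne \Lambda$,

(4) $R \in \mathrm{Funct}$,

(5) $\mathrm{Arg}(R) = \alpha$,

(6) $(x) : x \in \alpha .\supset. xP(R(x))$

yield

$(\mathrm{E}z) . z \in \alpha . z = R(z)$.

Proof. Assume the hypotheses. Define

(7) $W = \hat{\gamma}(\gamma \subseteq \alpha :. (y) : y \in \gamma .\supset. R(y) \in \gamma :. (\beta) : \beta \subseteq \gamma .
(\beta \upharpoonleft P \upharpoonright \beta) \in \mathrm{Sord} .\supset. \mathrm{Sup}(\beta,P) \subseteq \gamma)$,

(8) $M = \bigcap W$,

(9) $A = \hat{x}(x \in M :. (y) : y \in M . y(P - I)x .\supset. (R(y))Px)$.

Clearly $\alpha \in W$, so that

(10) $M \subseteq \alpha$,

(11) $M \in W$,

(12) $A \subseteq M$,

(13) $A \subseteq \alpha$.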

Lemma 1. $(x,y) : x \in A . y \in M .\supset. yPx \lor (R(x))Py$.

Proof. Assume

(i) $x \in A$

and define

(ii) $B = \hat{y}(y \in M : yPx \lor (R(x))Py)$.

Clearly

(iii) $B \subseteq M$,

(iv) $B \subseteq \alpha$.

Let $y \in B$. Then $y \in M$, so that $R(y) \in M$ by (11) and (7).
Case 1. $yPx . y \ne x$. Then $(R(y))Px$ by (i) and (9).
Case 2. $yPx . y = x$. Then $R(x) = R(y)$. But $R(x) \in \mathrm{AV}(P)$ by (6), so
that $(R(x))P(R(y))$.

Case 3. $(R(x))Py$. We get $yP(R(y))$ by (6), so that $(R(x))P(R(y))$.
Thus $(R(y))Px \lor (R(x))P(R(y))$, so that $R(y) \in B$. Thus we have

(v) $(y) : y \in B .\supset. R(y) \in B$.

Let $\beta \subseteq B . (\beta \upharpoonleft P \upharpoonright \beta) \in \mathrm{Sord}$. Then $\mathrm{Sup}(\beta,P) \subseteq M$ by (iii), (11), and (7).

Case 1. $(z) : z \in \beta .\supset. zPx$. Then $x \in \mathrm{Ub}(\beta,P)$, so that $y \in \mathrm{Sup}(\beta,P) .\supset.
yPx$. So $\mathrm{Sup}(\beta,P) \subseteq B$ by (ii).

Case 2. $(\mathrm{E}z) . z \in \beta . \sim(zPx)$. Then $(R(x))Pz$ by $z \in \beta$, $\beta \subseteq B$, and (ii).
But, since $\mathrm{Sup}(\beta,P) \subseteq \mathrm{Ub}(\beta,P)$, and since $z \in \beta$, we have $(y) : y \in \mathrm{Sup}(\beta,P) .
\supset. zPy$. But from $(R(x))Pz$ and $zPy$, we get $(R(x))Py$. So $(y) : y \in
\mathrm{Sup}(\beta,P) .\supset. (R(x))Py$. Then $\mathrm{Sup}(\beta,P) \subseteq B$ by (ii).

So $\mathrm{Sup}(\beta,P) \subseteq B$ in either case. Thus

(vi) $(\beta) : \beta \subseteq B . (\beta \upharpoonleft P \upharpoonright \beta) \in \mathrm{Sord} .\supset. \mathrm{Sup}(\beta,P) \subseteq B$.

Now by (iv), (v), (vi), and (7), $B \in W$. Then $M \subseteq B$ by (8) and Thm.
IX.5.5, Part I. Then by (iii), $B = M$. Then by (ii), $(y) : y \in M .\supset. yPx \lor
(R(x))Py$, and our lemma follows.

Lemma 2. $(x,y) : x \in A . y \in M .\supset. yPx \lor xPy$.

Proof. If $(R(x))Py$, then $xPy$, since $xP(R(x))$ by (6).
Lemma 3. $(x) : x \in A .\supset. R(x) \in A$.
Proof. Assume

(i) $x \in A$.

Then $x \in M$ by (12), so that

(ii) $R(x) \in M$

by (11) and (7).

Assume $y \in M . y(P - I)(R(x))$. By (i) and Lemma 1, $yPx \lor (R(x))Py$.
However, $(R(x))Py$ is impossible by $y(P - I)(R(x))$. So $yPx$.

Case 1. $x \ne y$. Then $y(P - I)x$. Then by (i) and (9), $(R(y))Px$. But
$xP(R(x))$ by (6), so that $(R(y))P(R(x))$.

Case 2. $x = y$. Then $R(x) = R(y)$, so that $(R(y))P(R(x))$.

So in either case, $(R(y))P(R(x))$, and we have shown

(iii) $(y) : y \in M . y(P - I)(R(x)) .\supset. (R(y))P(R(x))$.

Then by (ii), (iii), and (9), $R(x) \in A$.

Lemma 4. $(\beta) : \beta \subseteq A . (\beta \upharpoonleft P \upharpoonright \beta) \in \mathrm{Sord} .\supset. \mathrm{Sup}(\beta,P) \subseteq A$.

Proof. Assume

(i) $\beta \subseteq A . (\beta \upharpoonleft P \upharpoonright \beta) \in \mathrm{Sord}$.

Then $\beta \subseteq M$ so that by (11) and (7),
(ii) $\mathrm{Sup}(\beta,P) \subseteq M$.

Now assume

(iii) $x \in \mathrm{Sup}(\beta,P)$,

(iv) $y \in M . y(P - I)x$.

If now $(z) : z \in \beta .\supset. zPy$, then $y \in \mathrm{Ub}(\beta,P)$, so that $xPy$, which would
contradict (iv). So there is a z with

(v) $z \in \beta$,

(vi) $\sim(zPy)$.

By (v) and (i), $z \in A$, so that by (iv) and Lemma 2, $zPy \lor yPz$. Thus
$yPz$ by (vi). Also $y \ne z$ by (vi), so that $y(P - I)z$. But $z \in A$, so that by
(9) and (iv), $(R(y))Pz$. But by (v) and (iii), $zPx$. So $(R(y))Px$. Thus we
have

(vii) $(x,y) : x \in \mathrm{Sup}(\beta,P) . y \in M . y(P - I)x .\supset. (R(y))Px$.

If we use (ii) and (vii) with (9), we can infer $\mathrm{Sup}(\beta,P) \subseteq A$, and our
lemma is proved.
Finally, by (13), Lemma 3, Lemma 4, and (7), $A \in W$. Then by (8),
$M \subseteq A$, so that by (12), $A = M$. Then Lemma 2 gives $(x,y) : x,y \in M .\supset.
yPx \lor xPy$. So

(14) $(M \upharpoonleft P \upharpoonright M) \in \mathrm{Sord}$.

Now by (11), $M \in W$, and certainly $M \subseteq M$. Thus by (7),

(15) $\mathrm{Sup}(M,P) \subseteq M$,
and by (3),

(16) $\mathrm{Sup}(M,P) \ne \Lambda$.

Then by (16), there is a $z \in \mathrm{Sup}(M,P)$. Then by (15), $z \in M$. So $R(z) \in M$
by (11) and (7). Thus $(R(z))Pz$, since $z \in \mathrm{Sup}(M,P)$. However, $zP(R(z))$ by
(6) since $z \in \alpha$ by (10). So $z = R(z)$.
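In the finite case the conclusion of the theorem can be seen by simple iteration, without the apparatus of W, M, and A. The sketch below assumes a hypothetical function R on {1, 2, 3, 4, 6, 12}, given as a dictionary, with x dividing R(x) for every x, so that hypothesis (6) holds for the divisibility ordering; it is an illustration of the statement, not of the proof above.

```python
def find_fixed_point(start, R, limit=1000):
    """Follow start, R(start), R(R(start)), ... until a value z with R(z) = z appears."""
    z = start
    for _ in range(limit):
        nxt = R[z]
        if nxt == z:
            return z
        z = nxt
    return None   # no fixed point reached within the step limit


alpha = {1, 2, 3, 4, 6, 12}
R = {1: 2, 2: 4, 3: 6, 4: 12, 6: 12, 12: 12}   # hypothetical; x divides R[x]

print(find_fixed_point(3, R))
```

Starting from 3 the iteration passes through 6 and stops at 12. For an infinite α such an iteration need not terminate, which is why the proof works instead with the least class M closed under R and under suprema of simply ordered subsets.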

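Theorem XIV.1.2. $\vdash (\delta) : \mathrm{AxC}(\mathrm{Nc}(\mathrm{USC}(\delta))) .\supset. Z_1(\mathrm{Nc}(\delta))$.

Proof. Proof by reductio ad absurdum. Assume

(1) $\mathrm{AxC}(\mathrm{Nc}(\mathrm{USC}(\delta)))$,

(2) $\mathrm{Nc}(\alpha) \le \mathrm{Nc}(\delta)$,

(3) $\alpha = \mathrm{AV}(P)$,

(4) $P \in \mathrm{Pord}$,

(5) $(\beta) : \beta \subseteq \alpha . (\beta \upharpoonleft P \upharpoonright \beta) \in \mathrm{Sord} .\supset. \mathrm{Sup}(\beta,P) \ne \Lambda$,

(6) $\mathrm{Max}(\alpha,P) = \Lambda$.
Define

(7) $\lambda = \hat{\beta}((\mathrm{E}x) . x \in \alpha . \beta = \hat{y}\hat{z}(y = x . x(P - I)z))$.
Clearly $\beta,\gamma \in \lambda . \beta \cap \gamma \ne \Lambda .\supset. \beta = \gamma$, so that

(8) $(\beta,\gamma) : \beta,\gamma \in \lambda . \beta \ne \gamma .\supset. \beta \cap \gamma = \Lambda$.

Further, let $\beta \in \lambda$. Then $x \in \alpha . \beta = \hat{y}\hat{z}(y = x . x(P - I)z)$. Then $\beta = \Lambda .
\supset. (z) . \sim(x(P - I)z)$, so that, if $\beta = \Lambda$, then $x \in \mathrm{Max}(\alpha,P)$. So by (6),
$\beta \ne \Lambda$. Thus

$(\beta) : \beta \in \lambda .\supset. \beta \ne \Lambda$,
so that by (8)

(9) $\lambda \in \mathrm{ND}$.

Also, clearly $\lambda \;\mathrm{sm}\; \mathrm{USC}(\alpha)$, so that by (2), $\mathrm{Nc}(\lambda) \le \mathrm{Nc}(\mathrm{USC}(\delta))$. Then
by (9) and (1), there is an S with

(10) $(\beta) : \beta \in \lambda .\supset. (\mathrm{E}_1 w) . w \in \beta \cap S$.

From this we infer by (7), and putting $R = S \cap (P - I)$,

(11) $R \in \mathrm{Funct}$,

(12) $\mathrm{Arg}(R) = \alpha$.
Also,

(13) $(x) : x \in \alpha .\supset. x(P - I)(R(x))$.

Then by Thm.XIV.1.1 and (3), (4), (5), (11), (12), and (13), we infer
that there is a z with $z \in \alpha . z = R(z)$. This contradicts (13).

Corollary. $\vdash \mathrm{AxC} \supset Z_1$.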

Theorem XIV.1.3. $\vdash (n) : n,n^0 \in \mathrm{NC} . \mathrm{AxC}(n) .\supset. Z_1(n)$.

Proof. If $n,n^0 \in \mathrm{NC}$, then by Thm.XI.2.48, Cor. 2, there is a $\delta$ with
$n = \mathrm{Nc}(\mathrm{USC}(\delta))$. If now $\mathrm{Nc}(\alpha) \le n$, $\alpha = \mathrm{AV}(P)$, etc., then $\alpha$ is similar to
a subset $\beta$ of USC($\delta$), and then $\beta = \mathrm{USC}(\gamma)$ where $\gamma$ is a subset of $\delta$. Now
P induces on $\gamma$ a partial ordering Q with properties analogous to those
assumed for P. Then we can put $\gamma$ and Q for $\alpha$ and P in Thm.XIV.1.2 to
infer that $\gamma$ has a maximal element relative to Q. Then the corresponding
element of $\alpha$ is maximal with respect to P.

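Theorem XIV.1.4. $\vdash (\delta) : Z_1(\mathrm{Nc}(\mathrm{SC}(\delta \times \delta))) .\supset. Z_2(\mathrm{Nc}(\delta))$.

Proof. Assume

(1) $Z_1(\mathrm{Nc}(\mathrm{SC}(\delta \times \delta)))$,

(2) $\mathrm{Nc}(\alpha) \le \mathrm{Nc}(\delta)$.
Then

(3) $\mathrm{Nc}(\mathrm{SC}(\alpha \times \alpha)) \le \mathrm{Nc}(\mathrm{SC}(\delta \times \delta))$.
Define

(4) $A = \hat{P}(P \in \mathrm{Word} . \mathrm{AV}(P) \subseteq \alpha)$.

(5) $W = \hat{P}\hat{Q}(P,Q \in A : P = Q \lor (\mathrm{E}x) . x \in \mathrm{AV}(Q) . P = \mathrm{seg}_x Q)$.
Clearly

(6) $A = \mathrm{AV}(W)$.

(7) $W \in \mathrm{Ref}$.
By Thm.X.6.13, corollary,

(8) $W \in \mathrm{Trans}$.

Clearly $PWQ .\supset. \mathrm{No}(P) \le \mathrm{No}(Q)$. So if $PWQ$ and $QWP$, then
$\mathrm{No}(P) = \mathrm{No}(Q)$ by Thm.XII.3.7. Then $P \;\mathrm{smor}\; Q$, so that by Thm.
XII.2.1, $\sim(\mathrm{E}x) . x \in \mathrm{AV}(Q) . P = \mathrm{seg}_x Q$. Then from $PWQ$, we get $P = Q$. So

(9) $W \in \mathrm{Antisym}$.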
Lemma 1. (β):β ⊆ A.(β↿W↾β) ε Sord. ⊃ .AV(⋃β) ⊆ α.⋃β ε Sord.
Proof. Assume

(i) β ⊆ A,

(ii) (β↿W↾β) ε Sord,

and put

(iii) B = ⋃β.

Let x ε AV(B).

Case 1. x ε Arg(B). Then ⟨x,y⟩ ε B. So by (iii), there is a P ε β with
⟨x,y⟩ ε P. Then x ε AV(P), so that x ε α and ⟨x,x⟩ ε P by (i) and (4). Then
⟨x,x⟩ ε B by (iii). Thus xBx.

Case 2. x ε Val(B). Proceed similarly.

Thus

(iv) AV(B) ⊆ α,

(v) B ε Ref.

Let xBy and yBz. Then there are P and Q with P ε β.xPy and Q ε β.yQz.
By (ii), PWQ.∨.QWP.

Case 1. PWQ. Then by (5), P ⊆ Q. So xQy, and hence xQz, since
Q ε Trans by (4). Then xBz.

Case  2.     QWP.    Proceed  similarly. 

Thus 


(vi) B ε Trans.

Similarly, we get

(vii) B ε Antisym.

Let x,y ε AV(B). Then xBx and yBy by (v). Then P ε β.xPx and
Q ε β.yQy. By (ii), PWQ.∨.QWP.

Case 1. PWQ. Then P ⊆ Q by (5). So xQx. Thus x,y ε AV(Q). Thus
xQy.∨.yQx, since Q ε Connex by (4). Then xBy.∨.yBx.

Case 2. QWP. Proceed similarly.

Thus

(viii) B ε Connex.

Lemma 2. (β,x,y):β ⊆ A.(β↿W↾β) ε Sord.P ε β.y ε AV(P).x(⋃β)y.
⊃ .xPy.

Proof. Assume the hypothesis. Since x(⋃β)y, there is a Q with Q ε β.
xQy. Then PWQ.∨.QWP.

Case 1. PWQ.P = Q. Then xPy.
Case 2. PWQ.P ≠ Q. Then by (5), w ε AV(Q).P = seg_w Q. By y ε
AV(P), we have y(Q − I)w. Then by xQy, we get x(Q − I)w. So by the
definition of seg_w Q, we get x(seg_w Q)y. That is, xPy.

Case 3. QWP. Then by (5), Q ⊆ P. Then xPy.

Lemma 3. (β):β ⊆ A.(β↿W↾β) ε Sord. ⊃ .⋃β ε A.

Proof. Assume the hypothesis. By (4) and Lemma 1, it suffices to
prove ⋃β ε Word. Let

(i) B = ⋃β.

If now γ ∩ AV(B) ≠ Λ, then there is an x ε γ ∩ AV(B). So xBx. Then
P ε β.xPx. So γ ∩ AV(P) ≠ Λ. Thus there is a y with y min_P γ. Then
y ε γ ∩ AV(B). Also y min_B γ, for if z ε γ ∩ AV(B).z(B − I)y, then
z(P − I)y by Lemma 2, contradicting y min_P γ.

Lemma 4. (β):β ⊆ A.(β↿W↾β) ε Sord. ⊃ .⋃β ε Ub(β,W).

Proof. Assume the hypothesis and put B = ⋃β. By (6) and Lemma 3,
B ε AV(W).

Let P ε β. Then P ⊆ B.

Case 1. P = B. Then PWB by (5) and Lemma 3.

Case 2. P ≠ B. Then there is a w in B − P by Thm.IX.4.21. Then
w = ⟨x,y⟩. So xBy.~(xPy). Then by Lemma 2, ~ y ε AV(P). Then
AV(B) − AV(P) ≠ Λ. Let z be the least member of AV(B) − AV(P) with
respect to B, which is a well-ordering relation by Lemma 3. So

(i) z ε AV(B) − AV(P),

(ii) (w):w(B − I)z. ⊃ .w ε AV(P).

If now v(seg_z B)w, then by (ii), v,w ε AV(P). As vBw, we get vPw by
Lemma 2. So

(iii) seg_z B ⊆ P.

Conversely, let xPy. Then xBy, since P ⊆ B. If zBy, then zPy by
Lemma 2. But zPy gives z ε AV(P), contrary to (i). So ~(zBy). Then
y(B − I)z. Thus also x(B − I)z, so that x(seg_z B)y. Thus

(iv) P ⊆ seg_z B.

Then  by  (i),  (iii),  (iv),  and  (5),  PWB. 

Lemma 5. (β):β ⊆ A.(β↿W↾β) ε Sord. ⊃ .Sup(β,W) ≠ Λ.

Proof. Assume the hypothesis and put B = ⋃β. Then B ε Ub(β,W)
by Lemma 4. Now suppose P ε Ub(β,W). Then

(i) P ε A

by (6) and

(ii) (Q):Q ε β. ⊃ .QWP.
Then by (5), (Q):Q ε β. ⊃ .Q ⊆ P. Then
(iii) B ⊆ P

by Thm.IX.5.8, Part II.

Suppose y ε AV(B).xPy. Then yBy, so that there is a Q ε β with yQy.
Then by (ii), QWP.

Case 1. Q = P. Then xQy, so that xBy.

Case 2. w ε AV(P).Q = seg_w P. Then, since y ε AV(Q), we have
y(P − I)w. Then, by xPy, we have also x(P − I)w. So x(seg_w P)y. Thus
xQy, so that xBy.

So in any case xBy. Thus

(iv) (x,y):y ε AV(B).xPy. ⊃ .xBy.

Now  we  can  proceed  as  in  the  proof  of  Lemma  4,  using  (iv)  instead  of 
Lemma  2,  and  we  deduce  finally  BWP. 

Thus, B ε Sup(β,W).

Lemma 6. Max(A,W) ≠ Λ.

Proof. By (1), (6), (7), (8), (9), and Lemma 5, we need only prove
Nc(A) ≤ Nc(SC(δ × δ)). However, clearly A ⊆ SC(α × α), by (4), so
that we get the desired inequality by (3).

Lemma 7. (P):P ε Max(A,W). ⊃ .AV(P) = α.

Proof. Let P ε Max(A,W) and AV(P) ≠ α. Then by (4), there is an
x ε α with ~ x ε AV(P). In the notation of Ex.XII.1.5, put Q = P +,
{⟨x,x⟩}. By P ε Max(A,W), we have P ε A, so that P ε Word. Then by
Ex.XII.1.5(e), we get Q ε Word. Also, clearly AV(Q) ⊆ α. So Q ε A. By
Ex.XII.1.5(g), P = seg_x Q. So by (5), PWQ. Also, clearly P ≠ Q, since
x ε AV(Q).~ x ε AV(P). Then P(W − I)Q, contradicting P ε Max(A,W),
and our lemma is proved.

By Lemma 6, there is a P ε Max(A,W). So P ε A, so that P ε Word by
(4). By Lemma 7, AV(P) = α, and our theorem is proved.

Corollary. ⊢ Z₁ ⊃ Z₂.

Theorem XIV.1.5. ⊢ (n):n,n⁰ ε NC.Z₁(2^(n×n)). ⊃ .Z₂(n).

Proof. If n,n⁰ ε NC and n = Nc(δ), then n × n = Nc(δ × δ), and
Nc(SC(δ × δ)) = 2^(n×n).

We now derive Z₃(m) from Z₂(n), but unfortunately there seems to be
little relationship between m and n. Indeed one can have m anywhere
between 0 and 2ⁿ apparently.

Theorem XIV.1.6. ⊢ (n)::Z₂(n):. ⊃ ::(λ)::Nc(⋃λ) ≤ n:.(α):α ε λ. ⊃ .
α ≠ Λ.: ⊃ :.(ER):R ε Funct.Arg(R) = λ.Val(R) ⊆ 1:(α):α ε λ. ⊃ .
R(α) ⊆ α.

Proof.     Assume 
(1) Z₂(n),

(2) Nc(⋃λ) ≤ n,

(3) (α):α ε λ. ⊃ .α ≠ Λ.

Then by (2) and (1), there is a P with

(4) P ε Word,

(5) ⋃λ = AV(P).
Define

(6) R = α̂β̂(α ε λ:(Ex).x least_P α.β = {x}).
Clearly by (4) and Thm.X.6.14, Part III,

(7) R ε Funct.
Also by (3), (4), and Thm.X.6.14, Part II,

(8) Arg(R) = λ.
Clearly

(9) Val(R) ⊆ 1,

(10) (α):α ε λ. ⊃ .R(α) ⊆ α.
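
The construction in step (6) can be imitated for finite classes. The following Python sketch is only an illustration under stated assumptions: the well-ordering P is represented by a list `order` that names every member of the union from least to greatest, and `choice_from_well_order` is a hypothetical helper, not part of the formal system of this text.

```python
def choice_from_well_order(family, order):
    """Map each nonempty set in `family` to the unit class of its least member.

    `order` lists every element of the union of `family`, least first;
    it plays the role of the well-ordering P of steps (4) and (5).
    """
    rank = {x: i for i, x in enumerate(order)}
    result = {}
    for alpha in family:
        if not alpha:
            raise ValueError("hypothesis (3) requires nonempty members")
        least = min(alpha, key=rank.__getitem__)
        result[alpha] = frozenset({least})  # R(alpha) = {x}, x least in alpha
    return result


# Assumed sample data: three nonempty sets and an arbitrary ordering.
family = [frozenset({3, 1}), frozenset({2, 3}), frozenset({4})]
R = choice_from_well_order(family, order=[4, 3, 2, 1])
```

Each value of `R` is a unit class included in its argument, which is the finite counterpart of steps (9) and (10). No choice is needed here, since the ordering is given explicitly.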

Corollary. ⊢ Z₂ ⊃ Z₃.

Theorem XIV.1.7. ⊢ (n):Z₃(n). ⊃ .AxC(n).

Proof.     Assume 

(1) Z₃(n),

(2) Nc(λ) ≤ n,

(3) λ ε ND.
By (3) we have

(4) (α):α ε λ. ⊃ .α ≠ Λ,

(5) (α,β):α,β ε λ.α ≠ β. ⊃ .α ∩ β = Λ.
By  (1),  (2),  and  (4),  there  is  an  R  with 

(6) R ε Funct,

(7) Arg(R) = λ,

(8) Val(R) ⊆ 1,

(9) (α):α ε λ. ⊃ .R(α) ⊆ α.
Now put

(10) γ = ⋃(Val(R)).

If now α ε λ, then R(α) ⊆ γ by (10), R(α) ≠ Λ by (8), and α ∩ R(α) =
R(α) by (9). So α ∩ γ ≠ Λ. Thus

(11) (Ex).x ε α ∩ γ.

Now let y,z ε α ∩ γ. Then y ε w.w ε Val(R) and z ε v.v ε Val(R). Then
w = R(β) and v = R(γ). So y ε R(β) and z ε R(γ). By (9), y ε β and z ε γ.
But y,z ε α. So α ∩ β ≠ Λ and α ∩ γ ≠ Λ. So by (5), α = β and α = γ.
So y,z ε R(α). However, by (8), R(α) = {w}. So y = w = z, and thus
y = z. So by (11),

(E₁x).x ε α ∩ γ.
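
For a finite family the passage from a choice function to a selecting class can be checked directly. The sketch below is an illustration only; the family, the function `R`, and the helper `selector` are hypothetical names chosen for the example.

```python
def selector(family, R):
    """Return gamma, the union of the values of R, as in step (10)."""
    gamma = set()
    for alpha in family:
        gamma |= R[alpha]
    return gamma


# An assumed family of nonempty, pairwise disjoint sets (a member of ND).
family = [frozenset({1, 2}), frozenset({3}), frozenset({4, 5, 6})]
# A choice function whose values are unit classes included in the argument.
R = {alpha: frozenset({min(alpha)}) for alpha in family}
gamma = selector(family, R)
# Number of common members of gamma with each set of the family.
meets = [len(alpha & gamma) for alpha in family]
```

Because the sets are disjoint, `gamma` has exactly one member in common with each of them, which is what the proof establishes in general.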

Corollary. ⊢ Z₃ ⊃ AxC.
We have now proved

⊢ AxC ≡ Z₁ ≡ Z₂ ≡ Z₃.

Theorem XIV.1.8. ⊢ (n):Z₁(n). ⊃ .Z₄(n).
Proof. If we take R to be λ↿⊆↾λ, then clearly

(1) λ = AV(R),

(2) R ε Pord.

Also if μ ⊆ λ.⋃μ ε λ, then {⋃μ} = Sup(μ,R). So we get our theorem
by taking α to be λ in Z₁(n).
Corollary. ⊢ Z₁ ⊃ Z₄.

Theorem XIV.1.9. ⊢ (δ):Z₄(Nc(SC(δ))). ⊃ .Z₅(Nc(δ)).
Proof. Assume

(1) Z₄(Nc(SC(δ))),

(2) Nc(α) ≤ Nc(δ),

(3) λ ≠ Λ,

(4) λ ε FC.


Then by (4),

(5) (β):β ε λ. ≡ .(SC(β) ∩ Fin) ⊆ λ.

By (2),

(6) Nc(λ ∩ SC(α)) ≤ Nc(SC(δ)).
By (3), there is a β ε λ. Now Λ ε (SC(β) ∩ Fin). So by (5),
(7) Λ ε λ.
Lemma. (μ):μ ⊆ (λ ∩ SC(α)).(μ↿⊆↾μ) ε Sord. ⊃ .⋃μ ε (λ ∩ SC(α)).
Proof.     Assume 

(i) μ ⊆ (λ ∩ SC(α)),

(ii) (μ↿⊆↾μ) ε Sord.

Then by Thm.IX.5.8, Part II,
(iii) ⋃μ ε SC(α).

We prove by weak induction on n that
(iv) (n,β):.n ε Nn.β ε n.β ⊆ ⋃μ: ⊃ :(ER):R ε Funct:Arg(R) = USC(β):
Val(R) ⊆ μ:(x):x ε β. ⊃ .x ε R({x}).

Now let β ε (SC(⋃μ) ∩ Fin). Then by (iv) there is an R with

(v) R ε Funct,

(vi) Arg(R) = USC(β),

(vii) Val(R) ⊆ μ,

(viii) (x):x ε β. ⊃ .x ε R({x}).

Now we prove by weak induction on n that

(ix) (n,R):n ε Nn.R ε Funct.Arg(R) ε n. ⊃ .Val(R) ε Fin.

Since β ε Fin, one gets Val(R) ε Fin by (v), (vi), and (ix).
Case 1. β = Λ. Then β ε λ by (7).

Case 2. β ≠ Λ. Then Val(R) ≠ Λ. So by (ii) and Thm.XI.3.44, Cor.
3, there is a γ such that

(x) γ greatest_S Val(R),

where we write S for μ↿⊆↾μ. So (y):y ε Val(R). ⊃ .y ⊆ γ. Then by
(viii), (x):x ε β. ⊃ .x ε γ. Thus β ⊆ γ. Then β ε (SC(γ) ∩ Fin). However,
γ ε λ by (i) and (vii). So (SC(γ) ∩ Fin) ⊆ λ by (5). Thus β ε λ.
Thus in each case β ε λ. So we have shown

β ε (SC(⋃μ) ∩ Fin). ⊃ .β ε λ.

So (SC(⋃μ) ∩ Fin) ⊆ λ. Then by (5),

(xi) ⋃μ ε λ.

Then  by  (iii),  we  conclude  our  lemma. 

Then by (6) and our lemma, we can take λ to be λ ∩ SC(α) in Z₄(n).
Then we infer Max(λ ∩ SC(α),⊆) ≠ Λ, as desired.
Corollary. ⊢ Z₄ ⊃ Z₅.
Theorem XIV.1.10. ⊢ (n):n,n⁰ ε NC.Z₄(2ⁿ). ⊃ .Z₅(n).
Theorem XIV.1.11. ⊢ (n):Z₅(n). ⊃ .Z₆(n).
Proof. Assume

(1) Z₅(n),

(2) Nc(α) ≤ n,

(3) α = AV(R),

(4) R ε Pord.
Take

(5) λ = β̂(β ⊆ α.(β↿R↾β) ε Sord).
Clearly Λ ⊆ α.(Λ↿R↾Λ) ε Sord. So Λ ε λ. Thus

(6) λ ≠ Λ.
Also,  it  is  clear  from  (5)  that 

(7) (β):β ε λ. ⊃ .(SC(β) ∩ Fin) ⊆ λ.

Let (SC(β) ∩ Fin) ⊆ λ. If x ε β, then {x} ε (SC(β) ∩ Fin). So {x} ε λ.
So by (5), {x} ⊆ α. Then x ε α. So

(8) (β):(SC(β) ∩ Fin) ⊆ λ. ⊃ .β ⊆ α.

Let (SC(β) ∩ Fin) ⊆ λ and x,y ε AV(β↿R↾β). Then x,y ε β ∩ AV(R).
So {x,y} ε (SC(β) ∩ Fin). Then {x,y} ε λ. Thus Q ε Sord if we write Q
for ({x,y}↿R↾{x,y}). But by Thm.X.6.1, Part VII, AV(Q) = {x,y}. Then
xQy.∨.yQx. So x(β↿R↾β)y.∨.y(β↿R↾β)x. Thus (β↿R↾β) ε Connex. So by (8),

(β):(SC(β) ∩ Fin) ⊆ λ. ⊃ .β ε λ.

Then by this and (7), λ ε FC. So by (1), (2), and (6),

Max(λ ∩ SC(α),⊆) ≠ Λ.

But by (5), this is just the conclusion desired.
Corollary. ⊢ Z₅ ⊃ Z₆.
Theorem XIV.1.12. ⊢ (n):Z₆(n). ⊃ .Z₇(n).
Proof.     Assume 


(1) Z₆(n),

(2) Nc(α) ≤ n,

(3) α = AV(R),

(4) R ε Pord,

(5) (β):β ⊆ α.(β↿R↾β) ε Sord. ⊃ .Ub(β,R) ≠ Λ.
By (1), (2), (3), and (4), we infer that there is a β such that

(6) β ⊆ α.(β↿R↾β) ε Sord,

(7) (γ):γ ⊆ α.(γ↿R↾γ) ε Sord. ⊃ .~(β ⊂ γ).

By (5) and (6), Ub(β,R) ≠ Λ. Take x ε Ub(β,R). Then x ε Max(α,R).
For if not then there is a y with x(R − I)y. Then (z):z ε β. ⊃ .z(R − I)y.
So if we take γ = β ∪ {y}, then γ ⊆ α, (γ↿R↾γ) ε Sord, and β ⊂ γ,
contradicting (7). Then Max(α,R) ≠ Λ.
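
The argument that an upper bound of a maximal simply ordered subset is a maximal element can be traced on a small finite example. This Python sketch is an illustration under assumptions of our own: the partial ordering is divisibility on a hypothetical finite set `alpha`, and the maximal chain is found by a greedy pass, which needs no axiom of choice in the finite case.

```python
def leq(x, y):
    """The partial ordering R of the example: x divides y."""
    return y % x == 0


def is_chain(beta):
    """True when any two members of beta are comparable under leq."""
    return all(leq(x, y) or leq(y, x) for x in beta for y in beta)


def maximal_chain(alpha):
    """Extend a chain greedily until no member of alpha can be added."""
    beta = set()
    for x in sorted(alpha):
        if is_chain(beta | {x}):
            beta.add(x)
    return beta


alpha = {2, 3, 4, 6, 8, 9, 12}
beta = maximal_chain(alpha)
# Upper bounds of beta inside alpha, corresponding to Ub(beta, R).
upper_bounds = [x for x in alpha if all(leq(z, x) for z in beta)]
x = upper_bounds[0]
# x is maximal when no other member of alpha lies strictly above it.
is_maximal = not any(leq(x, y) and x != y for y in alpha)
```

A member rejected by the greedy pass stays incomparable with some member of the growing chain, so the result cannot be extended, and its upper bound has nothing strictly above it.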

Corollary. ⊢ Z₆ ⊃ Z₇.

Theorem XIV.1.13. ⊢ (n):Z₇(n). ⊃ .Z₁(n).

Proof. Since Sup(β,R) ⊆ Ub(β,R), the theorem follows immediately.

Corollary. ⊢ Z₇ ⊃ Z₁.

By Thms.XIV.1.8 to XIV.1.13, we have

⊢ Z₁ ≡ Z₄ ≡ Z₅ ≡ Z₆ ≡ Z₇.

The three statements AxC, Z₂, and Z₃ were considered by Zermelo and
proved mutually equivalent (see Zermelo, 1904; and Zermelo, 1908, first
paper). The various names "axiom of choice," "Zermelo's axiom,"
"Zermelo's theorem," and "well-ordering theorem" are to be found attached
to various of AxC, Z₂, and Z₃, in a manner that is far from uniform.
Actually, since the three statements are equivalent, the choice of names is not
logically significant, although it may be important on historical or esthetic
grounds.

Since 1935, when Zorn called attention to the considerable usefulness of
Z₄ (see Zorn, 1935), the statements Z₁, Z₄, Z₅, Z₆, and Z₇ have come into
great favor as alternatives to the axiom of choice. Although Zorn made no
claim to priority, and stated only Z₄, each of Z₁, Z₄, Z₅, Z₆, and Z₇ is now
known by the name of "Zorn's lemma." Actually Z₆ was stated by
Hausdorff (see Hausdorff, 1914, page 140). A dual form of Z₄ was stated
by R. L. Moore (see Moore, 1932, page 84), and apparently Z₄ was known
to Kuratowski even earlier. Apparently Z₅ is due independently to
Teichmüller (see Teichmüller, 1939) and Tukey (see Tukey, 1940, page 7).

Actually Z₁, Z₄, Z₅, Z₆, and Z₇ are not all the statements which bear the
title of Zorn's lemma. Most of them have duals, which bear the same
name (see Ex.XIV.1.1), and we list others in the exercises below. Others
are still being proposed from time to time (see Wallace, 1944, for example).

Besides  the  three  statements  originally  proposed  by  Zermelo,  and  the 
set  of  statements  known  as  Zorn's  lemmas,  there  are  many  other  equivalent 
statements.    Typical  of  such  statements  is 

(A) (m,n):m,n ε NC. ⊃ .m < n.∨.m = n.∨.m > n

(see  Ex.XII.4.9  and  Ex.XII.4.11).    A  considerable  number  of  statements 
of an analogous sort are given by Tarski (see Tajtelbaum-Tarski, 1924).
The  motive  for  proving  that  such  statements  as  (A)  are  equivalent  to  the 
axiom  of  choice  is  not  to  increase  the  already  extensive  list  of  equivalent 
statements.  Rather,  it  is  somewhat  as  follows.  The  statement  (A)  is 
highly desirable in the arithmetic of transfinite cardinals. Thus
mathematicians who are not sympathetic to the axiom of choice would like to
derive  (A)  without  the  axiom  of  choice.  By  proving  that  (A)  is  equivalent 
to  the  axiom  of  choice,  one  shows  that  this  is  not  possible,  so  that  those 
mathematicians  who  wish  to  use  (A)  must  endorse  the  axiom  of  choice, 
however  reluctant  they  may  be  to  do  so.  A  recent  addition  to  the  list  of 
statements  of  this  sort,  and  in  quite  a  different  domain,  is  Tychonoff's 
theorem  (see  Lefschetz,  1942,  page  19).  A  proof  of  its  equivalence  with 
the  axiom  of  choice  has  been  given  in  Kelley,  1950. 

The  statement  Z2,  if  assumed,  apparently  makes  it  possible  to  make 
choices  in  any  conceivable  fashion.  For  if  we  assume  Z2,  then  we  can  well 
order  the  universe  V.  Then  given  any  nonempty  set,  we  can  always  specify 
that the "least" element of it be taken. Thus there is a completely
determinate member of each nonempty subset, and all aspects of arbitrariness
or choosing are removed. Indeed, in terms of the relation P, which well
orders V, one can write an actual formula specifying a unique member for
each  nonempty  set.  Then  a  definition  involving  any  sort  of  choices  can  be 
expressed  by  an  explicit  term  of  the  formal  logic. 

Thus  it  appears  that,  by  assuming  the  axiom  of  choice,  one  has  formal 
machinery  to  duplicate  any  known  intuitive  procedure  involving  choices. 

Some attention has been given to versions of the axiom of choice for a
set λ of nonempty disjoint sets in which the restriction is put not on the
cardinality of λ but on the maximum cardinality of members of λ. In
this connection, see Mostowski, 1945, and Szmielew, 1947. Also, van Vleck
makes a note of the fact that his construction of a nonmeasurable set
depends only on choices from a λ whose member sets all have two elements
(see van Vleck, 1908).

We mentioned earlier that there are c analytic functions at each point
z₀ and c points, so that there is a temptation to conclude that there are
c × c analytic functions. However, in the proposed correspondence we are
counting each function many times, because it gets counted at least once at
each point where it is analytic, and if it is many-valued it gets counted
more than once at each point where it is analytic. Indeed, our count of c
analytic functions at a single point counted different branches of a given
function as distinct functions. That is, each point and convergent power
series defines a branch of an analytic function, and hence an analytic
function, so that we can construct a function whose arguments are pairs
⟨z,S⟩, where z is a point and S is a series, and whose values are analytic
functions. We then merely appeal to the principle that
(R):R ε Funct. ⊃ .Nc(Val(R)) ≤ Nc(Arg(R)).

However,  no  proof  of  this  is  known  that  does  not  use  the  axiom  of  choice. 
(For  a  proof  using  the  axiom  of  choice,  use  Ex.XIV.1.8,  below.) 
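
For a finite function the principle can be verified by selecting one argument for each value, in the spirit of Ex.XIV.1.8. The Python sketch below is an illustration only: `one_one_subfunction` is a hypothetical helper, a dict stands for the function R, and taking the least argument replaces any appeal to choice.

```python
def one_one_subfunction(R):
    """Return S, included in R, one-to-one, with the same values as R.

    R is a dict from arguments to values. For each value only the least
    argument mapped to it is kept.
    """
    S = {}
    seen = set()
    for arg in sorted(R):
        if R[arg] not in seen:
            seen.add(R[arg])
            S[arg] = R[arg]
    return S


# Assumed sample function with repeated values.
R = {0: "a", 1: "b", 2: "a", 3: "c", 4: "b"}
S = one_one_subfunction(R)
```

The arguments of `S` form a subset of the arguments of R that is in one-to-one correspondence with the values of R, which gives the inequality in the finite case. In the infinite case no such explicit rule for picking arguments is available in general.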

EXERCISES 

XIV.1.1. We mean by an infimum of a set β with respect to R merely the
supremum of β with respect to the converse of R. We mean by the dual of
Z₁(n) the statement:

"If α is a partially ordered set, of cardinal number less than or equal to n,
such that every simply ordered subset has an infimum in α, then there is a
minimal element of α."

Indicate why the dual of Z₁(n) is obviously equivalent to Z₁(n). List
those of Z₁(n), Z₄(n), Z₅(n), Z₆(n), and Z₇(n) which have duals, state the
duals, and indicate why each statement is obviously equivalent to its dual.

XIV.1.2. Take Z₈(n) to be:

"Let λ, of cardinality less than or equal to n, be a set of sets which is
partially ordered by inclusion. If the sum of each simply ordered subset
of λ is included in some member of λ, then there is a maximal element of
λ."

State Z₈(n) symbolically, and prove:

(a)  h  (n):Z,(n).  D  .Z^in). 

(b)  h  (n):Zs(n).  D  .Z,{n). 

(c) ⊢ Z₈(Nc(V)) ≡ AxC.
XIV.1.3. Take Z₉(n) to be:

"Given a set α, of cardinality less than or equal to n, a property λ of
finite character, and a subset β of α with the property λ, then there is a
maximal subset of α containing β and having the property λ."

Prove ⊢ (n):Z₅(n). ≡ .Z₉(n).

(Hint. To go from left to right, take

A = x̂(Eγ).γ ⊆ α.γ ∪ β ε λ.x ε γ.

If β ⊆ α.β ε λ, then clearly β ⊆ A, so that (γ):γ ⊆ A. ⊃ .γ ∪ β ⊆ A. Now
take α to be A in Z₅. If γ ε Max(λ ∩ SC(A),⊆), then γ ∪ β ⊆ A, so that
β ⊆ γ, since γ is maximal in A. Now γ is maximal in α, for if γ ⊆ δ and
δ ε (λ ∩ SC(α)), then δ ⊆ A because β ⊆ δ. To go from right to left, take
β to be Λ.)

XIV.1.4. Define a Z₁₀(n) which generalizes Z₆(n) in the same way that
Z₉(n) generalizes Z₅(n) and prove ⊢ (n):Z₁₀(n). ≡ .Z₆(n). (Hint. Take
Z₁₀(n) to be (AC1) as stated on page 42 of Birkhoff, 1948.)

XIV.1.5.  Define  Z]i(n)  and  Z^^in)  to  be  variants  of  Z^{n)  and  Z7(n) 
such  as  given  on  page  7  of  Tukey,  1940.  Find  the  numerical  relations 
between  pairs  of  m,  n,  and  p  that  are  needed  as  hypotheses  for  implications 
between corresponding pairs of Zr,{m), Z₁₁(n), and Z₁₂(p). (Hint. Use the
proofs indicated on page 8 of Tukey, 1940.)

XIV.1.6. Prove that AxC(n) and Zᵢ(n) all hold for finite n, indicating
in which cases the additional hypothesis n ≠ 0 is required. (Hint. Prove
⊢ (n):n ε Nn. ⊃ .Z₃(n) by weak induction, and derive the other cases
from this.)

XIV.1.7. Prove ⊢ AxC: ≡ :(R)(ES).S ε Funct.Arg(R) = Arg(S).S ⊆ R.

XIV.1.8. Prove ⊢ AxC: ≡ :(R):R ε Funct. ⊃ .(ES).S ε 1-1.Val(R) =
Val(S).S ⊆ R.

XIV.1.9. Prove ⊢ (δ):AxC(Nc(USC(δ))). ⊃ .Z₃(Nc(δ)). (Hint. Given
λ, define μ = R̂(Eα).α ε λ.R = {α} × USC(α), and apply AxC(Nc(USC(δ)))
to μ.)

XIV.1.10. Prove ⊢ (δ):Z₃(Nc(SC(δ))). ⊃ .Z₂(Nc(δ)). (Hint. If Nc(α)
≤ Nc(δ), use Z₃ to give an R with R ε Funct.Arg(R) = SC(α) − 0.
Val(R) ⊆ 1:(β):β ⊆ α.β ≠ Λ. ⊃ .R(β) ⊆ β. By Ex.XII.3.6, there is an
A(φ) such that

⊢ (φ):φ ε NO. ⊃ .A(φ) = ιy ({y} = R(α − {A(θ) | θ < φ})).
By Ex.XII.3.7, we see that there must be a φ such that α = {A(θ) | θ < φ}.
Then for θ's less than the least such φ, A(θ) induces a well ordering of α.)
XIV.1.11. Derive Thm.XIV.1.1 by utilizing "definition by transfinite
induction." (Hint. By Ex.XII.3.8, there is an A(φ) such that

⊢ (φ):.φ ε NO.~(Eθ).θ ε NO.φ = θ +_o 1_o: ⊃ :A(φ) = ιy (y ε Sup({A(θ) |
θ < φ},P))

⊢ (θ):θ ε NO. ⊃ .A(θ +_o 1_o) = R(A(θ)).

Now use Ex.XII.3.9.)

XIV.1.12. Show how Thm.XII.2.11 can be derived from Thm.XIV.1.1.

2.  How  Indispensable  Is  the  Axiom  of  Choice?  In  topology,  the  axiom 
of  choice  is  assumed  from  the  very  start,  and  uses  of  it  or  of  equivalent 
statements  are  frequent,  and  often  tacit.  There  is  little  evidence  at  the 
present  time  that  any  significant  portion  of  topology  can  be  derived  without 
the  use  of  the  axiom  of  choice,  or  at  least  of  some  form  of  it  of  restricted 
cardinality.  Moreover,  the  question  of  how  much  topology  could  be  done 
without  the  axiom  of  choice  is  apparently  receiving  no  attention  at  all. 

In algebra, quite the reverse is true. Some operations in algebra, such as
making an infinite number of extensions of a field, are very awkward to
perform without the axiom of choice. Certainly, one sacrifices much
generality in algebra by not using the axiom of choice. Nevertheless, much
can be done without it, and algebraists are inclined to proceed as far as
possible without it. For a discussion of this point, see Teichmüller, 1939.
In analysis, one can proceed quite a way without the axiom of choice.
If one starts with the theory of nonnegative integers given in Chapter XI,
one can proceed as in Landau, 1930, to develop a theory of real and complex
numbers without the axiom of choice. One can then proceed through
Hardy, 1947, to develop calculus in a rigorous fashion. Hardy does use the
axiom of choice at a very few places, in each of which it could be avoided
by some device such as making the required choices from among the
rational real numbers, which are denumerable, and hence well ordered.
Likely Hardy was aware of this, but he does not give any indication. One
can then start into Titchmarsh, 1939, which is based on an earlier edition
of Hardy, 1947. There are occasional uses of the axiom of choice. Thus
on page 13, for each n one takes an xₙ at which s(x) − sₙ(x) attains its
maximum. However, since s(x) − sₙ(x) is continuous for each n, one can
perfectly well specify xₙ uniquely as the leftmost point at which s(x) − sₙ(x)
attains its maximum, and then xₙ is defined as a function value of n, and
the axiom of choice is avoided.
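
The device of taking the leftmost maximizing point can be mimicked numerically. The following Python sketch is only an analogy under stated assumptions: a finite grid stands in for the interval, the sample function is hypothetical, and `leftmost_argmax` is an invented name. The actual argument rests on continuity, not on a grid.

```python
def leftmost_argmax(f, grid):
    """Return the first grid point at which f attains its maximum on the grid."""
    best_x, best_val = grid[0], f(grid[0])
    for x in grid[1:]:
        val = f(x)
        if val > best_val:  # strict comparison: ties keep the leftmost point
            best_x, best_val = x, val
    return best_x


# Assumed example: the maximum is attained at both x = -1 and x = 1.
grid = [i / 4 for i in range(-8, 9)]  # -2.0, -1.75, ..., 2.0
x_n = leftmost_argmax(lambda x: -(x * x - 1) ** 2, grid)
```

Although the maximum is attained twice, the rule returns a single determinate point, so the result is a function of the data and no selection among alternatives is involved.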

One can thus proceed without the axiom of choice through quite
considerable portions of the theory of complex variables and other important and
useful theories. Indeed, the first apparently unavoidable use in Titchmarsh,
1939, of the axiom of choice occurs in Sec. 10.25 on page 326, in the proof
of the first fundamental theorem of Lebesgue measure. Here one picks an
open set Oₙ for each set Eₙ of a sequence of measurable sets. There seems
no way to specify Oₙ uniquely, so that the proof fails unless one is permitted
to use the denumerable axiom of choice. We know of no other proof of the
theorem which will proceed without the denumerable axiom of choice.
That  is,  to  the  best  of  our  knowledge,  one  cannot  prove  the  first  fundamental 
theorem  of  Lebesgue  measure  without  an  appeal  to  the  denumerable  axiom 
of  choice. 

Two  more  explicit  uses  of  the  denumerable  axiom  of  choice  occur  in 
Titchmarsh's  development  of  the  theory  of  Lebesgue  measure,  namely  on 
page  329  and  page  369  of  Titchmarsh,  1939.  It  is  thus  open  to  grave  doubt 
that  one  can  develop  the  theory  of  Lebesgue  measure  without  use  of  the 
denumerable  axiom  of  choice. 

Clearly,  if  one  progresses  into  modern  analysis,  with  its  study  of  Banach 
and  other  spaces  and  its  extensive  use  of  topological  ideas,  it  is  no  longer 
possible  to  escape  the  axiom  of  choice,  because  it  is  so  basic  in  all  the 
topological  developments.  Even  so,  in  the  special  spaces  of  analysis  one 
can  sometimes  devise  special  proofs  without  the  axiom  of  choice  for  results 
which  require  its  use  in  general  spaces,  and  there  is  a  definite  amount  of 
interest  in  this  sort  of  thing  (see  Barsotti,  1947,  for  example). 

For  an  illustration  near  at  hand,  consider  the  theorem  that,  if  a  Hausdorff 
space  satisfies  the  second  countability  axiom,  then  every  countably  compact 
subset is compact. In general, the proof seems to require the denumerable
axiom  of  choice  (see  the  next  section).  If,  however,  one  takes  the  real  line 
as  the  space  in  question,  the  theorem  specializes  to  the  Heine-Borel  theorem, 
which  can  be  proved  without  the  axiom  of  choice  (see  Sec.  6  of  Chapter  XI). 
In  subjects,  such  as  the  analytic  theory  of  numbers,  which  use  a  lot  of 
analysis  of  the  classical  sort  which  is  independent  of  measure  theory,  it  is 
likely  that  one  can  prove  all  the  basic  results  (including  most  of  the  deep 
results)  without  the  axiom  of  choice.  However,  until  a  very  careful 
scrutiny  has  been  made  of  all  the  proofs,  this  is  merely  a  plausible  conjec- 
ture. 

3. The Denumerable Axiom of Choice. Many mathematicians do not
object particularly to the axiom of choice itself but do object to some of its
consequences. Thus one can find mathematicians who find arbitrary well
ordering repugnant but feel quite sympathetic toward AxC. In view of the
fact that ⊢ AxC ≡ Z₂, this attitude seems quite silly at first glance.
However, if we look at results such as Thm.XIV.1.4 or Ex.XIV.1.10, it begins
to appear that the attitude is defensible. Apparently, to prove Z₂(n), one
requires AxC(m) for a much larger m, say m of the order of magnitude of 2ⁿ.
For instance, one could apparently assume AxC(c) without the consequence
that the continuum can be well ordered. Actually, one can apparently
assume AxC(n) for quite a large n without entailing well ordering. For
instance, one might assume

(α,P):α = AV(P).P ε Word. ⊃ .AxC(Nc(α)).

That  is,  speaking  loosely,  we  would  only  permit  choices  to  be  made  in  a 
well-ordered  fashion.  This  assumption  would  permit  making  a  very  large 
number  of  choices  (at  least  Nc(NO),  for  example),  but  would  apparently 
not  enable  us  to  well  order  any  class  which  we  could  not  well  order  without 
the  assumption. 

A particularly appealing case of this sort is the denumerable axiom of
choice, namely, AxC(Den). Many mathematicians find this far less
objectionable than any axiom of choice of greater cardinality. Also one can
derive a considerable amount of very useful mathematics from the
denumerable axiom of choice (notably the theory of Lebesgue measure), whereas
a great increase in cardinality seems to be required to get a further
considerable increase in the number of provable theorems. Consequently, we shall
assume the denumerable axiom of choice as one of our axioms.

Axiom  scheme  15.  The  following  statement,  and  each  statement  got 
from  it  by  prefixing  some  set  of  universal  quantifiers,  is  an  axiom: 

AxC  (Den). 

A more useful form is given in the following theorem, which is just the
Principle C enunciated by Teichmüller.
**Theorem XIV.3.1. ⊢ (λ):.λ ε Den:(α):α ε λ. ⊃ .α ≠ Λ: ⊃ :(ER):
R ε Funct.Arg(R) = Nn:Val(R) ⊆ ⋃λ:(α):α ε λ. ⊃ .(Em).m ε Nn.R(m) ε α.
Proof.     Assume 

(1) Nn sm_S λ,

(2) (α):α ε λ. ⊃ .α ≠ Λ.
By Thm.XI.4.12, there is a T such that

(3) USC(Nn) sm_T Nn.
Define

(4) μ = {({m} × S(T({m}))) | m ε Nn}.
Clearly

(5) (β):β ε μ. ⊃ .β ≠ Λ,

(6) (β,γ):β,γ ε μ.β ≠ γ. ⊃ .β ∩ γ = Λ.

Also clearly USC(Nn) sm μ, so that μ ε Den. Then by Axiom scheme 15,
there is a P such that

(7) (β):β ε μ. ⊃ .(E₁x).x ε β ∩ P.

Taking R to be P ∩ (⋃μ), we easily prove that

R ε Funct,
Arg(R) = Nn,
Val(R) ⊆ ⋃λ,
(α):α ε λ. ⊃ .(Em).m ε Nn.R(m) ε α.

This theorem is similar to Z₃(Den), but more convenient.

We now prove the well-known result that, if one has a denumerable class
of denumerable classes, then the totality of all their elements is likewise
denumerable. Let us have a class $\lambda$ with members $\alpha_0, \alpha_1, \alpha_2, \ldots$, such that
each $\alpha$ is denumerable. In general, there is not a unique method of
enumerating each $\alpha$, but there are many enumerations. So for each $\alpha_n$ we must
"choose" an ordering, $x_{0,n}, x_{1,n}, x_{2,n}, \ldots$. This is where we require the
denumerable axiom of choice unless we can devise some means of specifying
a unique ordering for each $\alpha$. In the proof of Thm.XI.4.18 we specified a
unique ordering for each $\alpha$; indeed $W(n)$ was a function which enumerated
$\alpha_n$. However, in cases where no one knows how to specify a unique ordering
for each $\alpha$, we must appeal to the denumerable axiom of choice.
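
Once an enumeration has been fixed for each class, the union can be listed by running through the index pairs along anti-diagonals. The following Python sketch illustrates only this bookkeeping. The function `W` plays the role of the enumerating function mentioned above; the particular family (the multiples of n + 1) and the names `diagonal_pairs` and `enumerate_union` are hypothetical, introduced for illustration, and the sketch is not the proof of Thm.XI.4.17.

```python
from itertools import count, islice


def diagonal_pairs():
    """Yield every pair (m, n) of nonnegative integers exactly once,
    one anti-diagonal m + n = s at a time."""
    for s in count(0):
        for m in range(s + 1):
            yield (m, s - m)


def enumerate_union(W):
    """W(n) is the chosen enumeration of the n-th class, so W(n)(m) is its
    m-th element. Yield the elements of the union without repetitions."""
    seen = set()
    for m, n in diagonal_pairs():
        x = W(n)(m)
        if x not in seen:
            seen.add(x)
            yield x


def W(n):
    # Hypothetical family: class n consists of the multiples of n + 1,
    # enumerated in increasing order.
    return lambda m: (n + 1) * m


first_pairs = list(islice(diagonal_pairs(), 6))
first_elements = list(islice(enumerate_union(W), 10))
```

The point of the sketch is that no choosing happens inside `enumerate_union`: all the choosing is concentrated in being handed the single function `W`, which is exactly what the denumerable axiom of choice supplies when no explicit rule is available.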

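Theorem XIV.3.2.
**I. $\vdash (\lambda) : \lambda \in \mathrm{Den} \,.\, \lambda \subseteq \mathrm{Den} \,.\supset.\, \bigcup\lambda \in \mathrm{Den}.$

II. $\vdash (\lambda) : \lambda \in \mathrm{Den} \,.\, \lambda \subseteq \mathrm{Count} \,.\supset.\, \bigcup\lambda \in \mathrm{Count}.$

Proof of Part I. Let

(1)  $\lambda \in \mathrm{Den},$

(2)  $\lambda \subseteq \mathrm{Den}.$

Put

$\mu = \{\hat{R}(\mathrm{Nn} \ \mathrm{sm}_R \ \alpha) \mid \alpha \in \lambda\}.$

By (2),

$(\beta) : \beta \in \mu \,.\supset.\, \beta \neq \Lambda.$

Then by Thm.XIV.3.1, there is an $S$ with

$S \in \mathrm{Funct},$
$\mathrm{Arg}(S) = \mathrm{Nn},$
$(m) : m \in \mathrm{Nn} \,.\supset.\, S(m) \in \text{1-1} \,.\, \mathrm{Arg}(S(m)) = \mathrm{Nn} \,.\, \mathrm{Val}(S(m)) \in \lambda,$
$(\alpha) : \alpha \in \lambda \,.\supset.\, (Em) \,.\, m \in \mathrm{Nn} \,.\, S(m) \in \text{1-1} \,.\, \mathrm{Arg}(S(m)) = \mathrm{Nn} \,.\, \mathrm{Val}(S(m)) = \alpha.$

Then clearly

$\lambda = \{\mathrm{Val}(S(m)) \mid m \in \mathrm{Nn}\}.$

Also $\mathrm{Val}(S(0)) \in \mathrm{Den}$. Then by Thm.XI.4.17, Part II, $\bigcup\lambda \in \mathrm{Den}$.
Proof of Part II. Similar, except that we take

$\mu = \{\hat{R}((E\beta) \,.\, \beta \subseteq \mathrm{Nn} \,.\, \beta \ \mathrm{sm}_R \ \alpha) \mid \alpha \in \lambda\}.$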

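Theorem XIV.3.3. $\vdash (\alpha) : \alpha \in \mathrm{Infin} \,.\equiv.\, \mathrm{Den} \le \mathrm{Nc}(\alpha).$
Proof. By Thm.XI.4.6, Cor. 1, we easily go from right to left. Now
assume $\alpha \in \mathrm{Infin}$. Put

$\lambda = \{\mathrm{SC}(\alpha) \cap n \mid n \in \mathrm{Nn}\}.$

By Ex.XI.3.16(b), $(\beta) : \beta \in \lambda \,.\supset.\, \beta \neq \Lambda$. Also, clearly $\lambda \in \mathrm{Den}$. So by
Thm.XIV.3.1, there is an $R$ such that

$R \in \mathrm{Funct},$
$\mathrm{Arg}(R) = \mathrm{Nn},$

(1)  $(m) : m \in \mathrm{Nn} \,.\supset.\, R(m) \subseteq \alpha \,.\, (En) \,.\, R(m) \in n \,.\, n \in \mathrm{Nn}.$

(2)  $(n) : n \in \mathrm{Nn} \,.\supset.\, (Em) \,.\, m \in \mathrm{Nn} \,.\, R(m) \in n.$

Then $\mathrm{Val}(R) \in \mathrm{Den} \,.\, \mathrm{Val}(R) \subseteq \mathrm{Count}$. Then $\bigcup\mathrm{Val}(R) \in \mathrm{Count}$ by
Thm.XIV.3.2, Part II. Also $\bigcup\mathrm{Val}(R) \in \mathrm{Infin}$ by (2) and Ex.XI.3.16(b). Thus
$\bigcup\mathrm{Val}(R) \in \mathrm{Den}$. As $\bigcup\mathrm{Val}(R) \subseteq \alpha$ by (1), we get $\mathrm{Den} \le \mathrm{Nc}(\alpha)$.

Corollary. $\vdash (\alpha) :. \alpha \in \mathrm{Infin} : \equiv : (E\beta) \,.\, \beta \subset \alpha \,.\, \beta \ \mathrm{sm} \ \alpha.$

Proof. Use Exs.XI.4.14 and XI.4.15.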
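
As an informal illustration of the corollary, the successor map sends Nn one-to-one onto Nn with 0 removed, whereas a finite class admits no one-to-one map into a proper subset of itself. The Python sketch below checks the first statement on an initial segment only and the second by brute force on a four-element class. The names `shift` and `injects_into_proper_subset` are hypothetical, and a finite check of this kind is of course no substitute for the proof.

```python
from itertools import product


def shift(n):
    """Successor map: sends Nn one-to-one onto Nn with 0 removed."""
    return n + 1


def injects_into_proper_subset(elements):
    """True if some one-to-one map sends `elements` into a proper subset
    of itself. Tries every map, so only usable for very small classes."""
    elements = list(elements)
    whole = set(elements)
    for images in product(elements, repeat=len(elements)):
        image = set(images)
        one_to_one = len(image) == len(elements)
        if one_to_one and image < whole:
            return True
    return False


segment = range(1000)
images = [shift(n) for n in segment]
shift_is_one_to_one = len(set(images)) == len(images)
shift_misses_zero = 0 not in images
finite_case = injects_into_proper_subset(range(4))
```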

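Thus a class is infinite if and only if it is similar to a proper subset of itself.
This is often taken as the definition of an infinite class.

We now prove that any denumerable set of ordinals less than $\Omega$ has an
upper bound less than $\Omega$.

Theorem XIV.3.4. $\vdash (\alpha) :: \alpha \in \mathrm{Den} : (\theta) : \theta \in \alpha \,.\supset.\, \theta < \Omega \,.: \supset :. (E\phi) : \phi < \Omega :$
$(\theta) : \theta \in \alpha \,.\supset.\, \theta \le \phi.$

Proof. Assume $\alpha \in \mathrm{Den}$ and $(\theta) : \theta \in \alpha \,.\supset.\, \theta < \Omega$. Then

(1)  $\alpha \subseteq \hat{\theta}(\theta < \Omega).$

By Thm.X.6.14, Part II, there is a least $\phi$ such that $\alpha \subseteq \hat{\theta}(\theta \le \phi)$. Since
$\phi$ is the least such, we certainly get

(2)  $\bigcup\{\hat{\theta}(\theta < \psi) \mid \psi \in \alpha\} = \hat{\theta}(\theta < \phi).$

Since $\alpha \in \mathrm{Den}$, we get $\{\hat{\theta}(\theta < \psi) \mid \psi \in \alpha\} \in \mathrm{Den}$, and by Thm.XII.4.5,
Part III, and (1), $\{\hat{\theta}(\theta < \psi) \mid \psi \in \alpha\} \subseteq \mathrm{Count}$. So by Thm.XIV.3.2, Part
II, $\bigcup\{\hat{\theta}(\theta < \psi) \mid \psi \in \alpha\} \in \mathrm{Count}$. Then by (2), $\hat{\theta}(\theta < \phi) \in \mathrm{Count}$. So by
Thm.XII.4.5, Part IV, $\phi < \Omega$.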

The  denumerable  axiom  of  choice  suffices  for  the  proof  of  the  well-known 
properties  of  Lebesgue  measure  (see  Titchmarsh,  1939,  Chapters  X,  XI, 
and  XII).  However,  all  known  proofs  of  the  existence  of  nonmeasurable 
sets  require  the  use  of  an  axiom  of  choice  of  higher  cardinality.  Thus  one 
can  make  an  assumption  which  is  strong  enough  to  give  the  usual  properties 
of  measure  without  entailing  the  existence  of  nonmeasurable  sets  as  far  as 
is  known.  This  may  be  of  some  comfort  to  those  mathematicians  who  find 
nonmeasurable  sets  distasteful.  However,  they  should  not  rely  too  strongly 
on  this,  because  recently  the  existence  of  a  nonmeasurable  set  has  been 
proved  from  an  assumption  which  could  possibly  turn  out  to  be  weaker 
than  the  axiom  of  choice  (see  Sierpinski,  1947). 

If  a  Hausdorff  space  satisfies  the  second  countability  axiom,  then  it  is 
separable.  For  it  has  a  countable  set  of  neighborhoods,  and  upon  picking 
a  point  from  each  neighborhood  (which  Thm.XIV.3.1  entitles  us  to  do)  we 
have  a  denumerable,  everywhere  dense  subset. 

We likewise get the theorem that, if a space satisfies the second
countability axiom and is countably compact, then it is compact. Let $N_0$, $N_1$,
$N_2$, ... be the countable set of neighborhoods and suppose that every
infinite set has a limit point. Then every closed set is compact. For let $\alpha$
be a closed set. Choose $N_{a_0}$, the first $N$ having a point in common with $\alpha$.
Choose $N_{a_{n+1}}$, the first $N$ which has a point in common with
$\alpha - (N_{a_0} \cup N_{a_1} \cup \cdots \cup N_{a_n})$. This is merely a definition by strong induction. Then
there must be an $n$ such that $\{N_{a_0}, \ldots, N_{a_n}\}$ covers $\alpha$. For if not, then
pick a point $x$ from each of $\alpha$, $\alpha - N_{a_0}$, $\alpha - (N_{a_0} \cup N_{a_1})$, .... The
limit of this set of $x$'s cannot be in any $N_{a_i}$, and we have a contradiction.
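
The selection of the covering neighborhoods involves no choice, since at each stage "the first N meeting what is still uncovered" is uniquely determined. The following Python sketch shows that selection rule on a finite toy model in which neighborhoods are plain sets of integers. The function name `greedy_subcover` and the sample data are hypothetical; the topological argument that the process must stop is not modeled here, and in a finite model termination is automatic.

```python
def greedy_subcover(alpha, neighborhoods):
    """Return the indices a_0, a_1, ... where each new index is the first
    one whose neighborhood meets the part of alpha not yet covered."""
    remaining = set(alpha)
    chosen = []
    while remaining:
        # "First N having a point in common with the uncovered remainder":
        # uniquely determined by the given ordering, so nothing is chosen.
        idx = next(
            (i for i, N in enumerate(neighborhoods) if N & remaining), None
        )
        if idx is None:
            raise ValueError("the neighborhoods do not cover alpha")
        chosen.append(idx)
        remaining -= neighborhoods[idx]
    return chosen


# Hypothetical sample data for illustration only.
sample_alpha = {1, 2, 3, 4, 5}
sample_neighborhoods = [{0}, {1, 2}, {2, 3}, {6}, {4, 5, 6}, {3}]
sample_indices = greedy_subcover(sample_alpha, sample_neighborhoods)
```

By contrast, the later step of the argument ("pick a point x from each of ...") is a genuine appeal to the denumerable axiom of choice, and the sketch deliberately leaves it out.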

Many times, the axiom of choice is used in a proof when it can be avoided.
An example is a proof commonly given of Thm.IX.8.11, that every open
set is a sum of neighborhoods, which runs as follows.

"Let $\alpha$ be an open set. Then for each $x$ in $\alpha$, there is a neighborhood $N_x$
containing $x$ and contained in $\alpha$. Choose such an $N_x$ for each $x$ in $\alpha$. Then
$\alpha$ is clearly the sum of all such $N_x$."

Here one must use an axiom of choice of the cardinality of $\mathrm{USC}(\alpha)$.
However, the axiom of choice is not needed because there is no reason to
have a unique $N_x$ for each $x$. One can just as well take for each $x$ the set of
all $N_x$ included in $\alpha$. As this set is uniquely defined by $x$, we do not need to
appeal to the axiom of choice. In our proof of Thm.IX.8.11, we proceeded
in this manner.
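
The choice-free variant can be sketched on a finite toy model, again with neighborhoods represented as sets of integers. For each point the sketch takes all basic neighborhoods that contain the point and lie inside the open set, rather than selecting one of them. The names `neighborhoods_inside` and `as_sum_of_neighborhoods` and the sample base are hypothetical and serve only as an illustration of the idea.

```python
def neighborhoods_inside(x, alpha, base):
    """All basic neighborhoods that contain x and are included in alpha.
    This family is uniquely determined by x, so nothing is chosen."""
    return [N for N in base if x in N and N <= alpha]


def as_sum_of_neighborhoods(alpha, base):
    """Collect, over all x in alpha, every neighborhood of x inside alpha."""
    used = set()
    for x in alpha:
        used.update(neighborhoods_inside(x, alpha, base))
    return used


# Hypothetical base and open set for illustration only.
sample_base = [
    frozenset({1}),
    frozenset({1, 2}),
    frozenset({2, 3}),
    frozenset({3, 4}),
    frozenset({4}),
]
sample_open = frozenset({1, 2, 3})
sample_used = as_sum_of_neighborhoods(sample_open, sample_base)
sample_sum = frozenset().union(*sample_used)
```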

As  another  illustration  of  a  proof  with  and  without  the  axiom  of  choice, 
consider  the  theorem : 

If $\alpha$ is a nonempty closed set in the plane and $x$ is a point, then there is a
point of $\alpha$ which is nearest $x$; that is, a point $y$ in $\alpha$ such that the distance
from $x$ to $y$ is less than or equal to the distance from $x$ to any point of $\alpha$.

Proof with the Axiom of Choice. Take $d$ to be the greatest lower bound
of distances from $x$ to points of $\alpha$, so that for each positive $\varepsilon$ there is a
point $z$ in $\alpha$ with $|x - z| < d + \varepsilon$. Now choose $z_1, z_2, \ldots$ with

$$|x - z_n| < d + \frac{1}{n}.$$

Then the set $\{z_1, z_2, \ldots\}$ has a limit point, which is the desired $y$.
Proof without the Axiom of Choice. Take $d$ as above. Define

$$S_n = \hat{z}\left(z \in \alpha \,.\, |x - z| \le d + \frac{1}{n}\right).$$

Then each $S_n$ is closed, and the product of any finite number of $S$'s is
nonempty. Then the product of all the $S$'s is nonempty by Ex.XI.3.21, since
$S_1$ is a compact Hausdorff space by the Heine-Borel theorem. Then take $y$
to be any member of the product of all the $S$'s.
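
The steps compressed into the choice-free proof can be written out as follows. This is a sketch in ordinary notation, under the assumption that each $S_n$ is defined with a non-strict inequality (so that it is closed); Ex.XI.3.21 is used only for the passage from finite products to the product of all the $S$'s.

```latex
% Assumed reading: S_n is defined with a non-strict inequality, so each S_n is closed.
\begin{align*}
S_n &= \{\, z \in \alpha : |x - z| \le d + \tfrac{1}{n} \,\},
  \qquad S_1 \supseteq S_2 \supseteq S_3 \supseteq \cdots, \\
S_{n_1} \cap \cdots \cap S_{n_k} &= S_{\max(n_1,\ldots,n_k)} \neq \emptyset
  \quad\text{(since $d$ is a greatest lower bound)}, \\
y \in \bigcap_{n \ge 1} S_n &\implies |x - y| \le d + \tfrac{1}{n} \text{ for every } n
  \implies |x - y| \le d, \\
y \in \alpha &\implies d \le |x - y|, \qquad\text{hence } |x - y| = d .
\end{align*}
```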

In the proof using the axiom of choice, we picked a unique $z_n$ from each
$S_n$. As this was not really necessary in the present case, the proof could
just as well be carried out without the axiom of choice.

Clearly it is immaterial in each proof whether $x$ is in $\alpha$ or not. If $x$ is in $\alpha$,
then $y$ turns out to be $x$ itself, of course.

EXERCISES 

XIV.3.1. Prove $\vdash (n) : n \in \mathrm{NC} \,.\supset.\, n < \mathrm{Den} \,.\vee.\, n = \mathrm{Den} \,.\vee.\, n > \mathrm{Den}.$

XIV.3.2. Prove with and without the axiom of choice that, if $\alpha$ and $\beta$
are two closed nonempty sets in the plane, then there is a point $x$ in $\alpha$ and a
point $y$ in $\beta$ such that the distance between $x$ and $y$ is less than or equal to
the distance between any other two points of which one is in $\alpha$ and one
is in $\beta$.


CHAPTER  XV 
WE  REST  OUR  CASE 

We stated at the beginning that it was our aim to provide a formal
symbolic logic which would be adequate for the types of intuitive reasoning
used by mathematicians in their mathematical thinking. We disclaim any
adequacy for reasoning in nonmathematical fields, but we do claim that
we have accomplished our aim as far as mathematical reasoning is
concerned. Although divergence of opinion among mathematicians themselves
makes it impossible for us to adopt a procedure with respect to the axiom of
choice which will satisfy all, we have so arranged matters that different
opinions in this respect are readily accommodated within the framework
which we have set up. The crucial question is: "Aside from the clearly
indicated indeterminacy with regard to the axiom of choice, is our
framework sufficient?"

We  believe  it  is,  and  that  a  careful  and  attentive  reader  will  have  been 
convinced  by  now  that  it  is.  We  recognize  that  there  cannot  be  complete 
assurance  that  we  have  indeed  listed  all  essential  principles  unless  we 
should  continue  from  the  foundations  we  have  laid  and  develop  all  mathe- 
matics. The  reasons  against  our  undertaking  such  a  course  are  too  obvious 
to  mention,  and  we  propose  instead  that  we  now  refer  the  reader  to  a  suc- 
cession of  carefully  written  texts  which  are  already  in  existence  and  which 
will  carry  the  reader  well  into  the  main  stream  of  mathematics. 

Although  these  texts  are  carefully  written,  one  will  seldom  find  explicit 
references  in  them  to  logical  points.  This  is  right  and  proper.  In  the  main 
body  of  mathematics,  a  writer  should  proceed  in  a  logically  sound  manner, 
but  he  should  not  be  preoccupied  with  logical  points.  Such  a  writer  should 
be  fully  competent  in  logic  and  should  so  phrase  his  proofs  that  it  is  clear 
that  the  logical  niceties  could  be  supplied  if  they  were  called  for.  However, 
he  should  assume  similar  competence  on  the  part  of  his  readers  and  should 
focus  attention  on  the  mathematical  difficulties  and  developments.  Logical 
points  should  be  mentioned  only  in  case  there  is  a  genuine  logical  difficulty, 
and routine logical points should be entirely taken for granted.

In  the  main,  mathematics  is  written  in  this  fashion.  There  are  to  be 
found  cases  in  which,  through  ignorance  or  carelessness,  proofs  are  pre- 
sented in  which  it  would  be  difficult  to  supply  the  logical  details.  There 
are  also  to  be  found  cases  in  which  an  author  intrudes  quite  commonplace 
logical points into completely mathematical contexts as though he expected
them to cause difficulty for his readers. Both extremes should be avoided,
and  usually  are. 

We  have  developed  the  beginnings  of  a  theory  of  nonnegative  integers. 
Omitting  zero  gives  the  positive  integers,  from  which  one  can  construct  in 
succession  the  positive  rationals,  the  positive  reals,  the  reals,  and  finally 
the  complex  numbers.  Quite  adequate  details  are  given  in  Landau,  1930. 
Fuller  details  are  given  in  Kershner  and  Wilcox,  who  introduce  sets  of 
axioms  for  each  new  kind  of  number  and  use  the  Landau  constructions 
only to prove the consistency¹ of their successive sets of axioms. The relative
advantages of the alternative modes of development are discussed in
Chapter  21  of  Kershner  and  Wilcox. 

Landau  concerns  himself  exclusively  with  mathematical  matters.  The 
logic  needed  for  his  proofs  is  assumed  and  used  without  comment.  Nonethe- 
less, his  proofs  are  set  forth  with  great  care  (except  for  one  or  two  inexplica- 
ble slips)  and  it  would  be  merely  a  routine  exercise  in  logic  to  justify  all  his 
proofs  on  the  basis  of  our  axioms  and  theorems.  Kershner  and  Wilcox  are 
much  more  preoccupied  with  logical  matters  (in  spite  of  their  statement  on 
page  17  that  they  accept  the  whole  of  logic  as  a  fundamental  undefined 
notion)  so  that  their  proofs  come  closer  to  putting  the  logical  principles  in 
evidence  than  do  Landau's  proofs.  Doubtless  this  is  helpful  to  any  who  try 
to  read  their  text  without  having  first  acquired  an  adequate  logical  back- 
ground, but  we  do  not  believe  that  it  makes  their  proofs  any  more  precise 
or  easy  to  render  into  symbolic  logic  than  those  of  Landau. 

The  reader  may  be  disturbed  to  find  Kershner  and  Wilcox  using  the 
axiom  of  choice  in  the  proofs  of  several  of  their  theorems  (including  some 
which do not require its use). However, the theorems for which they have
made  use  of  the  axiom  of  choice  are  not  needed  for  their  main  developments, 
and  they  have  apparently  been  careful  always  to  designate  such  theorems  as 
depending  on  the  axiom  of  choice. 

If  the  reader  is  planning  to  read  either  Landau,  1930,  or  Kershner  and 
Wilcox,  we  venture  to  suggest  one  minor  improvement.     It  is  taken  for 

1  We have said repeatedly that no proof is known that our symbolic logic is
consistent. How then can we prove the consistency of a set of axioms for some branch of
mathematics? In an absolute sense, we cannot. Nevertheless, we can prove consistency
in a significant fashion. For instance, by taking the $I$, $1$, and $\sigma$ of Kershner and Wilcox,
p. 104, to be $\mathrm{Nn} - \{0\}$, $1$, and $\lambda x(x + 1)$, respectively, we can prove within the symbolic
logic that the Peano axioms (page 104 of Kershner and Wilcox) are consistent. This
gives us no positive assurance that these axioms are indeed consistent, since we do not
know that the symbolic logic is consistent. However, we have gained significant
information, to the effect that the Peano axioms are at least as consistent as our symbolic
logic. That is, if our symbolic logic does not contain a contradiction before we add the
Peano axioms, then no contradiction will be introduced if we decide to assume the
Peano axioms.

granted in both texts that one cannot justify definition by induction until
one has some of the properties of $<$. This is not so. Given merely Peano's
axioms (see Landau, 1930, page 2, or Kershner and Wilcox, page 104), one
can prove the special cases of our Thm.XI.3.23 and Thm.XI.3.24 with
$m = 1$ by almost precisely the proofs we have given for general $m$. If this
were done, it would simplify the definitions of $+$, $\times$, and $\Sigma$ for Landau (see
Landau, 1930, pages 4, 14, and 115) and of $+$ and $\times$ for Kershner and
Wilcox (see Kershner and Wilcox, pages 109, 116).
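
To make the point concrete, here is a minimal Python sketch of definition by induction on the positive integers that uses only the successor operation and tests of equality, never the relation $<$. It is an illustration under stated assumptions, not a rendering of Thm.XI.3.23 or of the constructions in Landau or in Kershner and Wilcox: Python integers stand in for a model of Peano's axioms, and the names `succ`, `define_by_induction`, `add`, and `mul` are hypothetical.

```python
def succ(x):
    """Stand-in for the Peano successor on the positive integers."""
    return x + 1


def define_by_induction(base, step):
    """Return the function F with F(1) = base and F(succ(n)) = step(F(n)).
    Only successor and equality are used; no ordering is needed."""
    def F(n):
        value = base
        k = 1
        while k != n:
            value = step(value)
            k = succ(k)
        return value
    return F


def add(m, n):
    # m + 1 = succ(m);  m + succ(n) = succ(m + n)
    return define_by_induction(succ(m), succ)(n)


def mul(m, n):
    # m * 1 = m;  m * succ(n) = (m * n) + m
    return define_by_induction(m, lambda v: add(v, m))(n)
```

For example, `add(3, 4)` unfolds the second clause three times starting from `succ(3)`, and `mul(3, 4)` adds 3 three times starting from 3.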

There  seems  to  be  a  belief  that,  once  one  has  the  real  number  system, 
one  can  proceed  into  calculus  and  analysis  with  no  further  ado.  This  is 
perhaps  true  of  the  theory  of  functions  of  a  single  real  variable,  but  for 
functions  of  a  complex  variable  some  geometric  ideas  seem  indispensable 
for  such  results  as  Cauchy's  integral  theorem.  So  perhaps  one  should  pro- 
ceed next  to  a  careful  treatment  of  geometry.  One  can  "construct"  geome- 
try by  the  Cartesian  method,  by  defining  points  as  ordered  triples  (x,y,z)  of 
real  numbers,  then  lines  as  sets  of  points  defined  by  appropriately  chosen 
parametric  equations  with  a  single  parameter,  betweenness  on  a  line  by 
means  of  the  parameter  on  the  line,  planes  by  parametric  equations  with 
two  parameters,  distance  and  metric  properties  in  the  obvious  fashion,  etc. 
Then  the  familiar  geometric  theorems  are  forthcoming.  Alternatively,  one 
can  use  the  construction  indicated  above  to  prove  the  consistency  of  some 
set  of  axioms  for  geometry,  such  as  those  given  in  Veblen,  1904,  and  then 
base  geometry  on  these  axioms.  What  is  most  needed  perhaps  in  calculus, 
complex  variable  theory,  and  other  branches  of  analysis  is  the  result  that 
the  Cartesian  construction  indicated  above  does  satisfy  the  axioms,  and 
hence  the  theorems,  of  Euclidean  geometry.  It  also  seems  necessary  to 
have  the  Jordan  curve  theorem,  at  least  for  polygons.  The  proof  of  the 
Jordan  curve  theorem  for  polygons  as  sketched  on  pages  267  to  269  of 
Courant  and  Robbins  can  be  carried  out  rigorously  (though  much  less 
briefly)  in  the  symbolic  logic  on  the  basis  either  of  the  Cartesian  construc- 
tion or  of  some  such  set  of  geometric  axioms  as  that  given  in  Veblen,  1904. 

Once  the  real  number  system  and  a  modicum  of  geometry  are  available, 
one  can  start  on  analysis.  As  a  foundation,  including  a  rigorous  treatment 
of  calculus,  Hardy,  1947,  is  excellent  and  fits  rather  well  onto  the  end  of 
Landau  or  Kershner  and  Wilcox.  There  is  not  much  overlap,  and  most  of 
the  gaps  come  from  Hardy's  assumption  of  simple  geometric  properties. 
If  one  takes  pains  to  supply  the  necessary  geometry,  then  there  are  almost 
no  gaps.  Typical  of  the  few  remaining  gaps  is  the  assumption  of  the 
binomial  theorem  on  page  142  of  Hardy,  1947.  This  gap  is  easily  filled,  for 
instance by our Ex.XI.3.11(d). For purposes of exposition, Hardy treats
some  topics  out  of  their  logical  order,  but  the  logical  order  is  easily  restored. 
A painstaking reader can find (and correct) a few minor slips, but on the
whole the book is very carefully written, and the proofs are readily
translatable into symbolic logic.

Hardy  consistently  uses  the  word  "function"  in  our  sense  of  "function 
value"  and  has  no  word  for  "function"  in  our  sense.  As  he  deals  mainly  with 
particular  functions  or  very  special  classes  of  functions,  this  does  not  cause 
him  noticeable  inconvenience.  In  many  cases  in  which  a  careful  verbal 
description  would  almost  certainly  require  a  name  for  our  concept  of 
"function,"  Hardy  instead  resorts  to  the  use  of  formulas  involving  the 
familiar notation, $f(x)$. We would not say that Hardy's treatment of
functions and function values is inadequate except as far as it succeeds in
diverting attention from the important concept of a function (in our sense
of the word).

After  completing  Hardy,  1947,  one  can  turn  to  Titchmarsh,  1939.  This 
fits  perfectly  onto  the  end  of  Hardy,  1947,  because  it  was  deliberately 
written  to  fit  onto  the  end  of  an  earlier  edition  of  Hardy,  1947. 

In  the  main,  Titchmarsh,  1939,  is  very  carefully  written,  and  the  proofs 
translate  readily  into  symbolic  logic.  A  hiatus  comes  in  the  theory  of 
curvilinear  integrals  of  a  complex  variable.  On  pages  74  to  79,  Titchmarsh 
gives  a  proof  of  the  Cauchy  integral  theorem  for  a  specially  limited  class  of 
contours  for  which  the  Jordan  curve  theorem  is  easily  proved.  His  proof 
does  not  have  an  adequate  treatment  of  the  question  of  orientation,  in  that 
he  does  not  prove  for  the  irregular  regions  (see  Titchmarsh,  1939,  page  76) 
that  traversing  them  in  counterclockwise  fashion  carries  one  in  the  proper 
direction  along  the  portion  of  the  curve  involved  in  the  irregular  region. 
However,  the  proof  is  readily  supplied,  and  the  real  hiatus  comes  on  page 
79  where  he  applies  the  Cauchy  integral  theorem  to  contours  more  general 
than  those  for  which  he  has  proved  it.  To  fill  this  gap  it  does  not  suffice 
merely  to  prove  the  Jordan  curve  theorem  in  full  generality,  since  two 
different contours from $z_0$ to $z$ may cross each other an infinite number of
times  and  in  a  most  unpleasantly  complicated  manner. 

Naturally,  the  gap  can  be  filled  by  first  going  into  a  full  treatment  of 
analysis  situs  (see  Moore,  1932,  for  instance).  However,  one  would  like  to 
avoid  this  if  possible,  and  we  propose  the  following  alternative. 

Let  us  first  prove  the  Jordan  curve  theorem  for  polygons  (see  above). 
Now,  since  the  polygon  is  a  closed  set  in  the  plane,  we  can  use  the  Heine- 
Borel  theorem  to  infer  that  it  can  be  covered  by  a  finite  number  of  interiors 
of  circles  with  the  following  properties.  Each  circle  together  with  its  interior 
touches  at  most  two  sides  of  the  polygon  and  touches  two  sides  only  in  case 
the  sides  have  a  common  vertex  which  is  in  the  interior  of  the  circle.  Using 
this  covering  of  circles  to  derive  the  necessary  orientability  conditions,  one 
can  now  prove  the  Cauchy  integral  theorem  for  polygons  by  the  procedure 
given in Secs. 2.33 and 2.34 of Titchmarsh, 1939. Then by induction on the
number of times $C$ intersects itself, one can prove that, if $C$ is a closed curve
consisting of a finite number of straight-line segments joined end to end
and lies inside a simply connected region in which $f(z)$ is analytic, then

$$\int_C f(z)\,dz = 0.$$

If now, in Sec. 2.36 of Titchmarsh, 1939, we restrict the contours to be
curves consisting of a finite number of straight-line segments joined end
to end, we nonetheless infer that, if $f(z)$ is analytic in a simply connected
region, then there is a single-valued function $F$ defined in the region such
that $F'(z) = f(z)$. Next one proves the lemma that, if an arbitrary $g(z)$
is analytic and has a continuous derivative in the region, then for that $g(z)$

$$\int_{z_0}^{z} g'(z)\,dz = g(z) - g(z_0)$$

for any rectifiable path lying in the region and connecting $z_0$ and $z$. Letting
$g(z)$ be $F(z)$, we infer

$$\int_{z_0}^{z} f(z)\,dz = F(z) - F(z_0)$$

along arbitrary rectifiable paths. This in turn immediately gives the
general result that, if $C$ is a closed rectifiable curve lying in the region, then

$$\int_C f(z)\,dz = 0.$$
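
The polygonal case can be checked numerically on an example. The Python sketch below integrates along straight-line segments with the composite Simpson rule and confirms, for one entire function, that the integral around a closed polygon vanishes (even when the polygon crosses itself) and that the integral along a polygonal path equals the difference of an antiderivative at the endpoints. The function names, the sample $f$ and $F$, and the sample polygons are assumptions made for illustration; a numerical check of this kind is not part of the proof outlined above.

```python
def segment_integral(f, a, b, steps=200):
    """Composite Simpson approximation of the integral of f along the
    straight segment from a to b, with z(t) = a + (b - a) t, 0 <= t <= 1.
    `steps` must be even."""
    h = 1.0 / steps
    total = f(a) + f(b)
    for k in range(1, steps):
        z = a + (b - a) * (k * h)
        total += (4 if k % 2 else 2) * f(z)
    return total * h / 3 * (b - a)  # dz = (b - a) dt


def path_integral(f, points):
    """Integral of f along the polygonal path through `points` in order."""
    return sum(segment_integral(f, a, b) for a, b in zip(points, points[1:]))


def polygon_integral(f, vertices):
    """Integral of f around the closed polygon with the given vertices."""
    return path_integral(f, list(vertices) + [vertices[0]])


def f(z):
    return z**2 + 3 * z          # sample entire function (assumption)


def F(z):
    return z**3 / 3 + 3 * z**2 / 2   # an antiderivative of f


square = [0, 2, 2 + 1j, 1j]      # simple closed polygon
bowtie = [0, 2 + 2j, 2, 2j]      # closed polygon that crosses itself
around_square = polygon_integral(f, square)
around_bowtie = polygon_integral(f, bowtie)
along_one_path = path_integral(f, [0, 1, 1 + 1j])
along_other_path = path_integral(f, [0, 1j, 1 + 1j])
```

Both closed integrals come out as zero up to rounding, and the two open paths from 0 to 1 + i give the same value, namely `F(1 + 1j) - F(0)`.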

This  amount  of  generality  easily  covers  most  uses  which  Titchmarsh 
makes  of  the  Cauchy  integral  theorem.  On  pages  100,  120-123,  145,  201- 
202,  and  284a,  Titchmarsh  makes  use  of  much  deeper  properties  of  curves, 
for  which  one  would  need  an  extensive  geometrical  development.  Fortu- 
nately, in  all  these  cases,  Titchmarsh  is  deriving  isolated  results  that  are 
not  used  elsewhere  in  his  text.  Thus,  for  most  of  Titchmarsh,  1939,  one 
can  manage  with  the  very  meager  geometrical  background  that  we  have 
indicated. 

Following  Hardy,  Titchmarsh  undertakes  to  use  "function"  in  our  sense 
of  "function  value"  and  to  dispense  with  any  designation  for  what  we  call 
a  "function."  Throughout  most  of  the  text,  he  succeeds  fairly  well,  but 
he  is  exceedingly  hampered  at  a  few  points,  notably  in  his  treatment  of 
analytic  continuation.  We  cannot  help  but  believe  that  the  student  would 
find  analytic  continuation  more  comprehensible  if  it  were  explained  in 
terms of $f$ rather than $f(z)$.

We have already commented on the apparently unavoidable use of the
denumerable axiom of choice in the theory of Lebesgue measure.

When one has finished Titchmarsh, 1939, one is prepared to read from a
wide  range  of  texts.  In  general,  they  are  written  with  less  painstaking 
attention  to  details  than  is  to  be  found  in  the  texts  which  we  have  cited. 
This  is  not  to  be  taken  as  an  indication  that  such  texts  are  not  carefully 
written.  Rather,  it  is  that  we  are  getting  into  a  domain  where  one  not  only 
takes  logic  for  granted,  but  also  much  of  the  mathematical  theory  so  meticu- 
lously explained  by  Landau,  Kershner  and  Wilcox,  Hardy,  and  Titchmarsh. 

In  the  usual  research  papers,  even  more  is  taken  for  granted,  so  that  such 
papers  are  commonly  unintelligible  except  to  experts  in  the  field.  Again, 
no  lack  of  rigor  is  necessarily  involved.  It  is  merely  that  in  any  text  it  is 
not  worth  while  supplying  details  which  the  readers  can  readily  supply. 
So,  when  writing  for  experts,  most  details  can  be  suppressed. 

How  can  we  be  sure  that,  among  all  the  logical  principles  implicitly 
used  in  such  writings,  there  do  not  appear  some  which  cannot  be  derived 
from  our  axioms?  Naturally,  we  cannot.  To  some  extent,  the  lack  of  any 
mention  of  logical  principles  by  the  authors  is  some  reassurance,  since  if 
they  were  aware  of  using  some  esoteric  logical  principle,  they  would  doubt- 
less record  that  fact. 

Actually,  the  question  of  the  adequacy  of  foundations  arises  as  much  for 
the  mathematical  foundations  as  for  the  logical  foundations.  With  many 
properties  of  real  numbers  being  used  implicitly,  how  can  we  be  reassured 
that  they  do  indeed  all  follow  from  some  such  basis  as  the  axioms  given  by 
Kershner  and  Wilcox?  Naturally,  we  cannot.  We  have  to  trust  that  the 
care  of  the  author  and  the  scrutiny  of  his  readers  suffice  to  detect  any 
results  which  do  not  follow  from  commonly  accepted  foundations. 

In conclusion, we should like to say that we do not wish to suggest that
all mathematical proofs can or should be carried out
solely on the logical basis which we have set up. Our symbolic logic is not
intended  as  a  model  for  how  mathematicians  should  think  but  only  as  a 
model  of  how  at  the  present  time  they  do  indeed  think.  Indeed  it  is  desir- 
able that  new  and  more  potent  and  flexible  principles  of  reasoning  be 
devised  and  generally  accepted,  so  that  distant  portions  of  the  mathematical 
edifice  will  become  more  readily  accessible.  One  advantage  of  a  symbolic 
logic  is  that  it  can  be  made  very  precise,  but  an  even  greater  advantage  is 
that it can be changed to fit the circumstances.